


ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


GEODESIC FIELDS IN THE CALCULUS OF VARIATION FOR 
MULTIPLE INTEGRALS 


By Hermann WEYL 


(Received April 15, 1935) 


Introduction 


Carathéodory recently drew my attention to an "independent integral" in the
calculus of variation for several variables exhibited by him in an important
paper in 1929,¹ and he asked me about its relation to a different independent
integral I made use of in a brief exposition of the same subject in the Physical
Review, 1934.² The present note was drafted to meet Carathéodory's question
(§11). To facilitate comparison I first serve my own dish again in Carathéodory
style (trace theory, Part 1) and then expound the essentials of his theory
(Part 2); the link between them thereby becomes fairly obvious. In Part 3 I
consider the approximation known as the second variation. Thus the whole
formal apparatus of the calculus of variation (Lagrange's equation, Legendre's,
Jacobi's, Weierstrass' conditions and Hilbert's independent integral) will be
found in these three Parts, packed together in a nutshell as it were. Chapter 4
solves the problem of embedding a given extremal in a geodesic slope field, this
notion taken in the sense of the trace theory.³ The reader who does not care
for technical details but wants the lucid simplicity of the general foundations
not to be marred by toilsome existential considerations is warned to ignore
this last Part.


Part 1. The linear trace theory 


§1. The problem of variation. $\nu$ functions of $r$ real variables $t^i$,

(1)    $z^\alpha = z^\alpha(t^1, \dots, t^r)$,  $(t^1, \dots, t^r)$ in $G$,  $(\alpha = 1, \dots, \nu)$

describe an $r$-dimensional "surface" $\Sigma$ in the $(r + \nu)$-dimensional $t$-$z$-space
covering a given region $G$ of the $t$-space. We consider only surfaces $\Sigma$ lying in
a certain domain $\Omega$ of the $t$-$z$-space which have their boundary in common;
i.e. the values of the functions (1) at the boundary of $G$ are prescribed once
for all.





¹ Acta litt. ac scient. univers. Hungaricae, Szeged, Sect. Math., 4, 1929, p. 193.

² Physical Review 46, 1934, p. 505.

³ Prof. Carathéodory advises me that Mr. Boerner did the same for his more
sophisticated theory.


The situation in the calculus of variations with $r$ independent variables
$t^i$ $(i = 1, \dots, r)$ is this: A function $L$ of the variables

$t^i$, $z^\alpha$, $z^\alpha_i$    $(\alpha = 1, \dots, \nu;\ i = 1, \dots, r)$

is given. By appropriate choice of $\Sigma$ one tries to minimize the integral

(2)    $J = J(\Sigma) = \int_G L\bigl(t^i, z^\alpha(t), dz^\alpha/dt^i\bigr)\, dt^1 \cdots dt^r$.


§2. Three stages of independent variables. $\nu \cdot r$ functions $z^\alpha_i(t, z)$ in $\Omega$ define
what we call a slope field $\mathfrak{F}$ in $\Omega$. The surface $\Sigma$, (1), is embedded in the slope
field $\mathfrak{F}$ if

$dz^\alpha/dt^i = z^\alpha_i\bigl(t^k, z^\beta(t)\bigr)$

holds.

We distinguish three standpoints concerning the arguments in our functions:
(1) $t^i$, $z^\alpha$, $z^\alpha_i$ are taken as independent variables, as for instance in the
function $L$. The derivatives with respect to these variables are marked by attaching
the respective variable as an index. (2) By using a given slope field $\mathfrak{F}$, the
$z^\alpha_i$ are replaced by functions of the $t^i$ and $z^\alpha$. The partial derivatives with
respect to the arguments $t^i$ and $z^\alpha$ are then denoted by $\partial/\partial t^i$, $\partial/\partial z^\alpha$. (3) The
substitution

$z^\alpha = z^\alpha(t^1, \dots, t^r)$    $\{z^\alpha_i = dz^\alpha/dt^i\}$

referring to a given surface $\Sigma$ changes functions which appeared in the second
(or the first) standpoint into functions of the $t$ alone. Their differentiation with
respect to $t^i$ is denoted by $d/dt^i$.

In keeping with these conventions and the further one that one always has
to sum over two-fold occurring indices, the vanishing of the first variation,
$\delta J = 0$, is expressed by Euler's equations

(3)    $\dfrac{d L_{z^\alpha_i}}{dt^i} - L_{z^\alpha} = 0$.

$\Sigma$ is called an extremal when satisfying these relations. The arguments in
$L_{z^\alpha}$ and $L_{z^\alpha_i}$ are $t^i$, $z^\alpha$, $dz^\alpha/dt^i$.
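
As an illustration of Euler's equations (3), consider a standard example that is not taken from the text: the Dirichlet integrand, in which the Lagrangian is assumed to depend on the slopes only. The computation below is a sketch under that assumption.

```latex
% Illustrative assumption (not in the text): the Dirichlet integrand
L = \tfrac12 \sum_{\alpha=1}^{\nu} \sum_{i=1}^{r} \bigl(z^\alpha_i\bigr)^2 ,
\qquad
L_{z^\alpha_i} = z^\alpha_i , \qquad L_{z^\alpha} = 0 .
% Euler's equations (3) then read
\sum_{i=1}^{r} \frac{d}{dt^i}\,\frac{dz^\alpha}{dt^i}
  = \sum_{i=1}^{r} \frac{\partial^2 z^\alpha}{(\partial t^i)^2} = 0
\qquad (\alpha = 1, \dots, \nu).
```

In this example the extremals are the surfaces whose $\nu$ coordinate functions $z^\alpha(t)$ are harmonic in $t$.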


§3. Lagrangian of the divergence type. One may form functions $L$ for which
the integral (2) is independent of $\Sigma$ by the following method. Let

(4)    $s^i(t, z)$    $(i = 1, \dots, r)$

be given functions in $\Omega$. After substituting the functions (1) for the
arguments $z$ we consider the divergence

$\dfrac{ds^i}{dt^i} = \dfrac{\partial s^i}{\partial t^i} + \dfrac{\partial s^i}{\partial z^\alpha}\,\dfrac{dz^\alpha}{dt^i}$.

Its integral is the flux of the vector field $s^i(t, z(t))$ through the boundary of $G$
and therefore depends on the values of $z(t)$ at the border of $G$ only. Hence
we choose as our $L$:

(5)    $D(t^i, z^\alpha, z^\alpha_i) = \dfrac{\partial s^i}{\partial t^i} + \dfrac{\partial s^i}{\partial z^\alpha}\, z^\alpha_i$.

All surfaces $\Sigma$ are extremals of this Lagrangian, which is linear in $z^\alpha_i$.
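
The boundary-only dependence of the integral of a divergence-type Lagrangian can be checked numerically in the simplest case $r = \nu = 1$. The following is a minimal sketch, not part of the original argument: the function `s`, the two comparison curves and the midpoint quadrature are all assumptions chosen for illustration.

```python
import math

def s(t, z):
    # Hypothetical field s(t, z), chosen only for illustration.
    return t * z * z + math.sin(z)

def s_t(t, z):
    # Partial derivative of s with respect to t.
    return z * z

def s_z(t, z):
    # Partial derivative of s with respect to z.
    return 2.0 * t * z + math.cos(z)

def D(t, z, slope):
    # Divergence-type Lagrangian (5) for r = 1, linear in the slope.
    return s_t(t, z) + s_z(t, z) * slope

def integral_of_D(curve, dcurve, a=0.0, b=1.0, n=20000):
    # Composite midpoint rule for the integral of D along z = curve(t).
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * h
        total += D(t, curve(t), dcurve(t))
    return total * h

# Two curves with the same boundary values z(0) = 0 and z(1) = 1.
line = (lambda t: t, lambda t: 1.0)
wiggle = (lambda t: t + 0.5 * math.sin(math.pi * t),
          lambda t: 1.0 + 0.5 * math.pi * math.cos(math.pi * t))

flux = s(1.0, 1.0) - s(0.0, 0.0)   # "flux through the boundary" for r = 1
i_line = integral_of_D(*line)
i_wiggle = integral_of_D(*wiggle)
print(flux, i_line, i_wiggle)
```

Both integrals agree with the boundary term up to quadrature error, whatever curve joins the two end points.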


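§4. Geodesic field and independent integral. Let $L$ be a given Lagrangian
and let us now suppose we succeeded in determining our functions (4) and the
slope field $z^\alpha_i(t, z)$ such that

(6)    $L = D$,  $L_{z^\alpha_i} = D_{z^\alpha_i}$  for $z^\alpha_i = z^\alpha_i(t, z)$.

A slope field of this kind may be called geodesic. We notice in passing that
$D_{z^\alpha_i} = \partial s^i/\partial z^\alpha$ does not contain the variables $z^\alpha_i$. For the "momenta" $L_{z^\alpha_i}$ we
often use the abbreviations $p^i_\alpha$. As

$D(t^i, z^\alpha, dz^\alpha/dt^i) = D(t^i, z^\alpha, z^\alpha_i) + p^i_\alpha\,(dz^\alpha/dt^i - z^\alpha_i)$,

its integral (the independent integral) under these circumstances changes into

(7)    $W = W(\Sigma) = \int_\Sigma \{L + p^i_\alpha\,(dz^\alpha/dt^i - z^\alpha_i)\}\, dt$.

A surface integral like

$\int_\Sigma F(t^i, z^\alpha, z^\alpha_i, \dot z^\alpha_i)\, dt$

is always to be interpreted as meaning

$\int_G F\bigl(t^i, z^\alpha(t), z^\alpha_i(t, z(t)), dz^\alpha/dt^i\bigr)\, dt^1 \cdots dt^r$.

The arguments of the functions $L$ and $p^i_\alpha$ in (7) are $t^i$, $z^\alpha$, $z^\alpha_i$.

A surface $\Sigma$ embedded in our geodesic slope field is of necessity an extremal for
the Lagrangian $L$. Indeed, on account of

$\dfrac{\partial L}{\partial z^\alpha} = L_{z^\alpha} + L_{z^\beta_i}\,\dfrac{\partial z^\beta_i}{\partial z^\alpha}$,  $\dfrac{\partial D}{\partial z^\alpha} = D_{z^\alpha} + D_{z^\beta_i}\,\dfrac{\partial z^\beta_i}{\partial z^\alpha}$,

one can supplement the equations (6) by:

$L_{z^\alpha} = D_{z^\alpha}$  for $z^\alpha_i = z^\alpha_i(t, z)$.

Hence the identity

$dD_{z^\alpha_i}/dt^i - D_{z^\alpha} = 0$,

which is satisfied for every surface, leads for a surface $\Sigma$ embedded in our
geodesic field to (3).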


In the case $r = 1$, $\nu = 1$ the independent integral (7) was first propounded
by Hilbert.

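§5. Legendre transformation. The equations (6) with the definition (5) of
$D$ are equivalent to

(8)    $\dfrac{\partial s^i}{\partial z^\alpha} = p^i_\alpha$,  $\dfrac{\partial s^i}{\partial t^i} = L - p^i_\alpha z^\alpha_i$.

We therefore have to introduce into the function

$H = L - p^i_\alpha z^\alpha_i$

the momenta

(9)    $p^i_\alpha = L_{z^\alpha_i}$

instead of the $z^\alpha_i$ as independent variables:

(10)    $H = H(t^i, z^\alpha, p^i_\alpha)$

(Legendre's transformation). The total differential

$\delta L = L_{t^i}\,\delta t^i + L_{z^\alpha}\,\delta z^\alpha + p^i_\alpha\,\delta z^\alpha_i$

leads at once to

(11)    $\delta H = L_{t^i}\,\delta t^i + L_{z^\alpha}\,\delta z^\alpha - z^\alpha_i\,\delta p^i_\alpha$;

thus one gets

$z^\alpha_i = -H_{p^i_\alpha}$

as the converse of the equations (9). In order to construct a geodesic field one
has to solve the one Jacobi-Hamilton differential equation

(12)    $\dfrac{\partial s^i}{\partial t^i} = H(t^i, z^\alpha, \partial s^i/\partial z^\alpha)$;

the formula

$z^\alpha_i = -H_{p^i_\alpha}(t, z, \partial s/\partial z)$

then furnishes the geodesic field.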

By the way, the equation (12) can be formulated in such a manner that it
does not involve any derivatives with respect to the $t$'s: the integral of
$H(t^i, z^\alpha, \partial s^i/\partial z^\alpha)$ over an arbitrary part $V$ of the region $G$ in $t$-space is equal
to the flux of the vector-field $s^i$ through the boundary of $V$.⁴
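
A minimal worked instance of the Legendre transformation may make the sign conventions concrete. It rests on the assumption, not made in the text, that $L$ is the Dirichlet integrand; the constants $c^i_\alpha$ are likewise introduced only for illustration.

```latex
% Assumed example: L = (1/2) * sum over alpha, i of (z^alpha_i)^2
p^i_\alpha = L_{z^\alpha_i} = z^\alpha_i ,
\qquad
H = L - p^i_\alpha z^\alpha_i
  = -\tfrac12 \sum_{\alpha, i} \bigl(p^i_\alpha\bigr)^2 ,
\qquad
-H_{p^i_\alpha} = p^i_\alpha = z^\alpha_i .
% The Jacobi-Hamilton equation (12) becomes
\sum_{i} \frac{\partial s^i}{\partial t^i}
  = -\tfrac12 \sum_{\alpha, i} \Bigl(\frac{\partial s^i}{\partial z^\alpha}\Bigr)^2 ,
% and one particular solution, with arbitrary constants c^i_alpha, is
s^i = c^i_\alpha z^\alpha - \frac{t^i}{2r} \sum_{\beta, k} \bigl(c^k_\beta\bigr)^2 ,
\qquad
z^\alpha_i = c^i_\alpha .
```

The geodesic field obtained in this example has constant slopes, and the surfaces embedded in it are the parallel planes $z^\alpha = c^i_\alpha t^i + \text{const}$. Note that with the convention $H = L - p^i_\alpha z^\alpha_i$ the Hamiltonian has the opposite sign to the one customary in mechanics.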





⁴ With such limitations as to the spread of $V$, of course, as are necessary for this
statement to make sense: $V = V_z$ has to be such for a given point $(z)$ that all points
$(t, z)$ lie in $\Omega$ when $(t)$ lies in $V$.





§6. Weierstrass' formula. A surface $\Sigma$ embedded in our geodesic field $\mathfrak{F}$
is extremal, and the integral $W(\Sigma)$, (7), coincides with $J(\Sigma)$ for this surface.
Let us then suppose we have an extremal $\Sigma_0$:

(13)    $z^\alpha = \mathring z^\alpha(t^1, \dots, t^r)$,  $(t^1, \dots, t^r)$ in $G$,

lying in a region $\Omega$ of $t$-$z$-space and embedded in a geodesic field $\mathfrak{F}$ that
covers $\Omega$. We compare $\Sigma_0$ with other surfaces $\Sigma$, (1), in $\Omega$ of the same boundary.
Using the notations

$J(\Sigma) = J$, $J(\Sigma_0) = J_0$; $W(\Sigma) = W$, $W(\Sigma_0) = W_0$; $\Delta J = J - J_0$

we have

$\Delta J = \Delta(J - W) = (J - W) - (J_0 - W_0)$,

because of the independence of $W$, and furthermore $J_0 = W_0$, because of the
embedding of $\Sigma_0$ in $\mathfrak{F}$. In this simple fashion we arrive at Weierstrass' formula

(14)    $\Delta J = J - J_0 = \int_\Sigma E(t^i, z^\alpha; z^\alpha_i, \dot z^\alpha_i)\, dt$,

(15)    $E(t^i, z^\alpha; z^\alpha_i, \dot z^\alpha_i) = [L(t, z^\alpha, \dot z^\alpha_i) - L(t, z^\alpha, z^\alpha_i)] - L_{z^\alpha_i}\,(\dot z^\alpha_i - z^\alpha_i)$,

the clue to which is the fact that the difference $J - J_0$ is expressed by a single
integral extending over $\Sigma$; $\Sigma_0$ has mysteriously been juggled out. In (15) $L_{z^\alpha_i}$
depends on the arguments $t^i$, $z^\alpha$, $z^\alpha_i$, and in (14) $\dot z^\alpha_i$ stands for the slopes
$dz^\alpha/dt^i$ of $\Sigma$.

One may say that the method consists in replacing $L$ by $L - D$, subtracting
a suitable $D$ of the type (5) from $L$; this process does not change the extremals
of $L$. The "suitable" choice of $D$ is effected by solving the Jacobi-Hamilton
equation (12).

Sufficient for a ("strong") minimum is the positive-definite character of
Weierstrass' $E$-function:

(16)    $E(t^i, z^\alpha; z^\alpha_i, \dot z^\alpha_i) \ge 0$.

Here the $\dot z^\alpha_i$ range independently over all values from $-\infty$ to $+\infty$, $z^\alpha_i = z^\alpha_i(t, z)$
are the slope functions of the embedding geodesic field and the point $(t, z)$
varies in a region $\Omega$ surrounding the extremal $\Sigma_0$ in the $t$-$z$-space. The
existence of such a field is an integral part of Weierstrass' criterion.
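
The sign condition (16) is easy to probe numerically for $r = \nu = 1$. The sketch below is only an illustration under stated assumptions: the two Lagrangians (arc length and a pure quadratic), the helper names and the sample grid are hypothetical and do not come from the text.

```python
import math

def excess(L, L_slope, t, z, field_slope, curve_slope):
    # Weierstrass E-function (15) for r = nu = 1:
    # E = L(curve slope) - L(field slope) - L_slope(field slope) * (difference).
    return (L(t, z, curve_slope) - L(t, z, field_slope)
            - L_slope(t, z, field_slope) * (curve_slope - field_slope))

def L_arc(t, z, q):
    # Assumed example: arc-length integrand.
    return math.sqrt(1.0 + q * q)

def L_arc_q(t, z, q):
    # Its derivative with respect to the slope q (the momentum).
    return q / math.sqrt(1.0 + q * q)

def L_quad(t, z, q):
    # Assumed example: quadratic integrand.
    return 0.5 * q * q

def L_quad_q(t, z, q):
    return q

# Grid of (field slope, curve slope) pairs between -3 and 3.
samples = [(-3.0 + 0.5 * a, -3.0 + 0.5 * b)
           for a in range(13) for b in range(13)]

min_arc = min(excess(L_arc, L_arc_q, 0.0, 0.0, q, qd) for q, qd in samples)
quad_value = excess(L_quad, L_quad_q, 0.0, 0.0, 1.0, 3.0)
print(min_arc, quad_value)
```

For the quadratic integrand the E-function is half the squared slope difference; for arc length it is non-negative on the whole grid, as convexity in the slope suggests.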


§7. Invariance. The $t^i$ may be subjected to an arbitrary transformation
among themselves. We might even replace the region $G$ of the $t$-space by an
arbitrary $r$-dimensional manifold $G$ only parts of which can be referred to
coördinates $t^1, \dots, t^r$. The $r$ quantities $z^\alpha_i$ $(i = 1, \dots, r)$ are to be treated as
components of a covariant vector (with respect to the Latin indices, matched
with the variables $t^i$). The Lagrangian $L$ is to be transformed as a scalar
density (of weight 1), i.e. it is to be multiplied with the absolute value of the
functional determinant of the transformation of the $t$. The integral $J(\Sigma)$ then
has an invariant significance, even when the whole $G$ is not coverable by a
single coördinate system $t$. Covariance and contravariance are designated by
the position of the indices in the usual way. Some of the quantities, in
particular $L$, $s^i$, $p^i_\alpha$, $H$, $E$, are densities in the sense just described; I would have
denoted them by German letters in accordance with the usage in my book
"Raum, Zeit, Materie," had I not to reckon with the Anglo-Saxon aversion to
these types.

It is conceptually simpler to take as the realm of integration $G$ the whole
manifold, not a finite portion (= compact subset) thereof. We then must
replace the boundary condition for $\Sigma$ by the requirement that $\Sigma$ coincides with
the standard extremal $\Sigma_0$ outside a sufficiently large finite portion of $G$
(dependent on $\Sigma$). Under these circumstances, the difference $\Delta J$, as its integrand
vanishes outside that finite region, has a meaning (though not the integral $J(\Sigma)$
itself).

At a higher standpoint of invariance the dependent variables $z^\alpha$ may be
included in the transformations. But in contrast to the $t^i$ they should not be
looked upon as a separate set in the row of $r + \nu$ variables $t^1, \dots, t^r, z^1, \dots, z^\nu$;
we have the case of "reduction," not of "decomposition." The situation
prevailing can be described in this way. An $(r + \nu)$-dimensional manifold $\Omega$ is
mapped upon the $r$-dimensional manifold $G$; this mapping, called the projection,
is given once for all. Thus $G$ may be considered as the manifold arising from $\Omega$
by identifying points $\omega$ in $\Omega$ with the same projection $t$. The coördinates
$t^1, \dots, t^r, z^1, \dots, z^\nu$ covering a part of $\Omega$ are subject to the restriction that the
coördinates $t^1, \dots, t^r$ have the same values at points $\omega$ with the same
projection, but all transformations in agreement with this requirement are
admissible. $\Sigma$ is a mapping of $G$ in $\Omega$: $t \to \omega$ such that the image $\omega$ of $t$ has $t$ as its
projection. The behavior of all our quantities could be easily discussed under
this wider aspect of invariance; but I do not wish to dwell upon it here.


Part 2. Carathéodory’s determinant theory and its relation to the trace theory 


§8. Lagrangian of the determinant type. Carathéodory uses a different
independent integral. He too starts with $r$ functions

(17)    $S^i(t, z)$

from which he forms, with reference to a given surface $\Sigma$, (1), instead of the
divergence (5), the functional determinant

(18)    $\left|\dfrac{dS^i}{dt^k}\right|$,  $\dfrac{dS^i}{dt^k} = \dfrac{\partial S^i}{\partial t^k} + \dfrac{\partial S^i}{\partial z^\alpha}\,\dfrac{dz^\alpha}{dt^k}$.

Its integral over $G$ is independent of $\Sigma$, as long as the boundary of $\Sigma$ is
preserved; for it gives the volume in the $\lambda$-space upon which the region $G$ in the
$t$-space is mapped by the transformation

$S^i(t, z(t)) = \lambda^i$.

In accordance with the new formation (18) we now take

(19)    $D(z^\alpha_i) = \left|\dfrac{\partial S^i}{\partial t^k} + \dfrac{\partial S^i}{\partial z^\alpha}\, z^\alpha_k\right|$.

This Lagrangian too has the property of possessing all surfaces as its extremals.
Since $D$ is not linear in the arguments $z^\alpha_i$ (the only ones we put in evidence)
one needs a little algebraic computation to compare $D$ for two sets of values
$z^\alpha_i$, $\dot z^\alpha_i$: $D(z^\alpha_i) = D$ and $D(\dot z^\alpha_i)$.


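§9. An algebraic identity. $D = D(z^\alpha_i)$ is the determinant of certain
quantities of form

$S^i_k = s^i_k + \sigma^i_\alpha z^\alpha_k$.

The element $s^i_k + \sigma^i_\alpha \dot z^\alpha_k$ of the second determinant $D(\dot z^\alpha_i)$ can be written as

$S^i_k + \sigma^i_\alpha u^\alpha_k$,

the $u^\alpha_k$ being the differences $\dot z^\alpha_k - z^\alpha_k$. Application of the multiplication theorem
of determinants readily leads to the formula

$|S^i_k + \sigma^i_\alpha u^\alpha_k| = |S^i_k| \cdot |\delta^i_k + \pi^i_\alpha u^\alpha_k|$

where the $\pi^i_\alpha$ are determined by the equations

(20)    $S^i_k\,\pi^k_\alpha = \sigma^i_\alpha$.

I maintain that

(21)    $\pi^i_\alpha = D_{z^\alpha_i}/D$.

Indeed, let $\|T^i_k\|$ be the inverse matrix of $\|S^i_k\|$. The general formula

$dD/D = T^k_i\, dS^i_k$,

when applied to differentiation with respect to $z^\alpha_i$, yields

$D_{z^\alpha_i}/D = T^i_k\,\sigma^k_\alpha$,

and this shows exactly that (21) are the solutions of the equations (20). Hence
the following identity obtains

(22)    $D(\dot z) = D(z) \cdot |\delta^i_k + (D_{z^\alpha_i}/D)(\dot z^\alpha_k - z^\alpha_k)|$.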
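
The identity (22) and the relation (21) can be verified numerically for $r = \nu = 2$. The following sketch uses arbitrary sample matrices; all numerical values and helper names are assumptions made for illustration, and the derivative in (21) is approximated by a central difference.

```python
def det2(m):
    # Determinant of a 2 x 2 matrix given as nested lists.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    # Product of two 2 x 2 matrices.
    return [[sum(a[i][j] * b[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

def add2(a, b):
    return [[a[i][k] + b[i][k] for k in range(2)] for i in range(2)]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

# Arbitrary sample data (hypothetical values).
s_part = [[1.0, 0.3], [-0.2, 1.5]]   # s^i_k,          rows i, columns k
sigma = [[0.4, -0.1], [0.2, 0.7]]    # sigma^i_alpha,  rows i, columns alpha
z = [[0.5, -0.3], [0.1, 0.8]]        # z^alpha_k,      rows alpha, columns k
z_dot = [[1.2, 0.4], [-0.6, 0.3]]    # second set of slopes

def S_matrix(slopes):
    # S^i_k = s^i_k + sigma^i_alpha * slopes^alpha_k
    return add2(s_part, matmul2(sigma, slopes))

def D(slopes):
    return det2(S_matrix(slopes))

u = [[z_dot[a][k] - z[a][k] for k in range(2)] for a in range(2)]
pi = matmul2(inv2(S_matrix(z)), sigma)   # solves S^i_k pi^k_alpha = sigma^i_alpha
identity = [[1.0, 0.0], [0.0, 1.0]]

lhs = D(z_dot)
rhs = D(z) * det2(add2(identity, matmul2(pi, u)))

# Central-difference check of (21): pi^i_alpha = (dD/dz^alpha_i) / D.
h = 1e-5
max_fd_error = 0.0
for a in range(2):
    for i in range(2):
        zp = [row[:] for row in z]
        zm = [row[:] for row in z]
        zp[a][i] += h
        zm[a][i] -= h
        derivative = (D(zp) - D(zm)) / (2.0 * h)
        max_fd_error = max(max_fd_error, abs(derivative / D(z) - pi[i][a]))

print(lhs, rhs, max_fd_error)
```

The two sides of (22) agree to rounding error, and the finite-difference quotients reproduce the entries of $\pi^i_\alpha$.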


§10. Geodesic field and independent integral once more. When the functions

$S^i(t, z)$,  $z^\alpha_i(t, z)$

are such that

$L = D$,  $L_{z^\alpha_i} = D_{z^\alpha_i}$  for $z^\alpha_i = z^\alpha_i(t, z)$,

Carathéodory calls the slope field $z^\alpha_i(t, z)$ geodesic. In a geodesic field (22)
changes into

$D(\dot z) = L \cdot |\delta^i_k + (L_{z^\alpha_i}/L)(\dot z^\alpha_k - z^\alpha_k)|$.

The arguments of $L$ and $L_{z^\alpha_i} = p^i_\alpha$ are here: $t$, $z^\alpha$, $z^\alpha_i(t, z)$. The independent
integral takes on the form

$W(\Sigma) = \int_\Sigma L \cdot |\delta^i_k + (p^i_\alpha/L)(dz^\alpha/dt^k - z^\alpha_k)|\, dt$.

All further developments follow the same line as in Part 1. The differential
equation, though, imposed on $S^i$ by the requirement that the slope field $z^\alpha_i(t, z)$
be geodesic is essentially more complicated. The rôle of the Hamiltonian $H$ in
Part 1 is taken over by the determinant

$L \cdot \left|\delta^i_k - \dfrac{1}{L}\, p^i_\alpha z^\alpha_k\right|$.

The theory will work only if this function as well as $L$ are of constant sign in
the region to be considered.


§11. Mutual relationship of the two independent integrals. The relation
between the two competing theories of Parts 1 and 2 which serve the same
end is now fairly obvious. They do not differ in the case of only one
variable $t$. In the general case, the extremals for the Lagrangian $L$ are the
same as for $L^* = 1 + \varepsilon L$, $\varepsilon$ being an arbitrary constant. Notwithstanding,
Carathéodory's theory is not linear with respect to $L$. But applying it to
$1 + \varepsilon L$ instead of $L$ and then letting $\varepsilon$ tend to zero, we fall back on the linear
theory of Part 1. One has to choose Carathéodory's functions

$S^i(t, z) = t^i + \varepsilon \cdot s^i(t, z)$.

Neglecting quantities that tend to zero with $\varepsilon$ more strongly than $\varepsilon$ itself, one
then gets

$\left|\dfrac{dS^i}{dt^k}\right| = 1 + \varepsilon\,\dfrac{ds^i}{dt^i}$,

or Carathéodory's $D^*$, (19), becomes $= 1 + \varepsilon D$ where $D$ has the significance
(5) of Part 1. One may therefore describe Carathéodory's theory as a finite
determinant theory and the simpler one of Part 1 as the corresponding
infinitesimal trace theory.

The Carathéodory theory is invariant when the $S^i$ are considered as scalars
not affected by the transformations of $t$. It appears unsatisfactory that the
transition here sketched, by introducing the density 1 relatively to the
coördinates $t^i$, breaks the invariant character. This however is related to the
existence of a distinguished system of coördinates $t^i$ in the determinant theory,
consisting of the functions $S^i(t, z(t))$. This remark reveals at the same time that,
in contrast to the trace theory, it is not capable of being carried through without
singularities on a manifold $G$ that cannot be covered by a single coördinate
system $t$.
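
The passage from the determinant to the trace rests on the elementary expansion $|\delta^i_k + \varepsilon a^i_k| = 1 + \varepsilon\, a^i_i + O(\varepsilon^2)$. A small numerical sketch for $r = 2$ follows; the matrix `A`, standing in for the derivatives $ds^i/dt^k$ at one point, is a hypothetical sample.

```python
def det2(m):
    # Determinant of a 2 x 2 matrix given as nested lists.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical sample values playing the role of ds^i/dt^k at one point.
A = [[0.7, -1.2], [0.4, 0.3]]
trace_A = A[0][0] + A[1][1]

def determinant_lagrangian(eps):
    # |dS^i/dt^k| for S^i = t^i + eps * s^i, i.e. det(1 + eps * A).
    return det2([[1.0 + eps * A[0][0], eps * A[0][1]],
                 [eps * A[1][0], 1.0 + eps * A[1][1]]])

ratios = []
for eps in (1e-1, 1e-2, 1e-3):
    remainder = determinant_lagrangian(eps) - (1.0 + eps * trace_A)
    ratios.append(remainder / eps ** 2)   # stays bounded as eps -> 0

print(ratios)
```

The remainder divided by $\varepsilon^2$ stays bounded (for $r = 2$ it equals the determinant of `A`), so the determinant Lagrangian reduces to $1 + \varepsilon \cdot$ trace in first order.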


§12. Special extremal slope fields. Returning, for the rest of the paper,
to the theory of Part 1, we keep to the definitions and notations explained
there. In my article in the Physical Review I viewed the problem from a
slightly different angle. One is accustomed, in the classical case of one variable $t$
and one unknown $z$, to perform the embedding by means of a field of extremals.
I therefore started with a field of extremal surfaces simply covering $\Omega$, and I
introduced the gradient

(23)    $dz^\alpha/dt^i = z^\alpha_i(t, z)$

of the field surface passing through $(t, z)$. Such a gradient field of extremals is,
according to (3), characterized by the relations

(24)    $\left(\dfrac{\partial L_{z^\alpha_i}}{\partial t^i} + \dfrac{\partial L_{z^\alpha_i}}{\partial z^\beta}\, z^\beta_i\right) - L_{z^\alpha} = 0$.

Conversely, if one is given the slope field $z^\alpha_i(t, z)$ arbitrarily, one can find a
corresponding field of surfaces provided equations (23) are completely integrable,
the conditions of integrability being

$\left(\dfrac{\partial z^\alpha_i}{\partial t^k} - \dfrac{\partial z^\alpha_k}{\partial t^i}\right) + \left(\dfrac{\partial z^\alpha_i}{\partial z^\beta}\, z^\beta_k - \dfrac{\partial z^\alpha_k}{\partial z^\beta}\, z^\beta_i\right) = 0$.

I proposed to call a slope field $z^\alpha_i(t, z)$ satisfying the equations (24) an extremal
slope field whether it be integrable or not.

With respect to a Lagrangian $D$ of the special form (5) not only is every
surface an extremal, but every slope field is an extremal field. This is an
immediate consequence of the fact that $D$ is linear in $z^\alpha_i$ as we shall see at once.
Therefore our geodesic field must needs be an extremal field for $L$; on account
of $p^i_\alpha = \partial s^i/\partial z^\alpha$ it satisfies the conditions

(25)    $\dfrac{\partial p^i_\alpha}{\partial z^\beta} - \dfrac{\partial p^i_\beta}{\partial z^\alpha} = 0$.

For this reason I conceived the geodesic fields in the Physical Review as "special
extremal slope fields"; and thus the essential modifications imposed upon the
classical concept of an extremal field appeared as dropping off integrability and
replacing it by the new conditions (25). For Carathéodory's $D$, however, it is
not true at all that every slope field is extremal, notwithstanding the fact that
all surfaces are extremals of $D$. This robs the notion of a special extremal field
of its primary importance for our present purpose.








In order to justify our assertion that the left side of (24) vanishes identically
for $L = D$, (5), one merely needs to observe that it does not contain the
derivatives of $z^\alpha_i(t, z)$ since $D_{z^\alpha_i} = \partial s^i/\partial z^\alpha$ does not contain the variables $z^\alpha_i$. A
surface $z^\alpha(t)$ may be chosen such that $z^\alpha(t)$, $dz^\alpha/dt^i$ have arbitrarily given values at
one specific point $t$. Hence our statement is evident from the fact that every
surface is extremal for $D$. He who is not afraid of a simple calculation could
verify the averred identical vanishing at once.


Part 3. Second variation 


§13. Legendre's quadratic form. Let us consider the Weierstrass $E$-function
for definite values of $t^i$, $z^\alpha$, $z^\alpha_i$ and expand it into a power series in terms of
the variables $u^\alpha_i = \dot z^\alpha_i - z^\alpha_i$. The expression (15) shows that the constant and
linear terms are missing and the development starts with the quadratic term:

(26)    $\tfrac12 L_{z^\alpha_i z^\beta_k}\, u^\alpha_i u^\beta_k = \tfrac12 F(t^i, z^\alpha, z^\alpha_i \mid u)$.

It should not go unnoticed that the discriminant of this quadratic form in the
$u$'s is that determinant whose non-vanishing makes possible the solving of the
equations (9) for $z^\alpha_i$. Our form $F$ when taken on $\Sigma_0$, i.e. for

$z^\alpha = \mathring z^\alpha(t)$,  $z^\alpha_i = d\mathring z^\alpha/dt^i$,

may be designated by $F_0(t \mid u)$. The positive definite character of the
quadratic form $F_0(t \mid u)$, for every $(t)$ in $G$, is, as is seen from this whole
development, a sufficient condition for a "weak minimum" (Legendre's condition).

Whereas Weierstrass' condition refers explicitly to an embedding geodesic
field, Legendre's condition does not. Does it therefore guarantee a weak
minimum without assuming the existence of an embedding geodesic field? No, that
is exactly where Legendre was wrong. But only the approximate geodesic field
(Jacobi's condition) enters into the proof of Legendre's criterion; approximate
to the same degree as (26) approximates the $E$-function. Legendre's stunt of
subtracting a divergence

$\dfrac{d}{dt^i}\left(\tfrac12 s^i_{\alpha\beta}\,\delta z^\alpha\,\delta z^\beta\right)$

from the integrand of the second variation $\delta^2 J$ is exactly the same procedure
for that infinitesimal variation as the Weierstrass-Hilbert-Carathéodory method
of subtracting a $D$ from $L$ with respect to the finite "variation" $\Delta J$.


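§14. Trivial preparations for solving the problem of embedding. This
coincidence will become clearer when we now attack the problem of embedding a
given extremal

$\Sigma_0$:  $z^\alpha = \mathring z^\alpha(t^1, \dots, t^r)$

in a geodesic slope field. We have to construct a solution $s^i$ of (12) such that
$\partial s^i/\partial z^\alpha$ reduces to $\mathring p^i_\alpha(t)$ for $z^\alpha = \mathring z^\alpha(t)$. Here let $\mathring p^i_\alpha(t)$ be the value of $p^i_\alpha = L_{z^\alpha_i}$
for $z^\alpha = \mathring z^\alpha(t)$, $z^\alpha_i = d\mathring z^\alpha/dt^i$, so that we have conversely

$d\mathring z^\alpha/dt^i = -H_{p^i_\alpha}\bigl(t, \mathring z(t), \mathring p(t)\bigr)$.

$\Sigma_0$ being an extremal, the equation

$d\mathring p^i_\alpha/dt^i = H_{z^\alpha}\bigl(t, \mathring z(t), \mathring p(t)\bigr)$

obtains [observe that $H_{z^\alpha} = L_{z^\alpha}$, because of (11)]. We rid ourselves of the
constant and linear terms in $s^i$ and $H$ in the following simple way.
Writing $\mathring z^\alpha(t) + z^\alpha$, $\mathring p^i_\alpha(t) + p^i_\alpha$ instead of $z^\alpha$ and $p^i_\alpha$, we put

$s^i\bigl(t, \mathring z(t) + z\bigr) - s^i\bigl(t, \mathring z(t)\bigr) = \mathring p^i_\alpha(t)\, z^\alpha + \sigma^i(t, z)$;

$H\bigl(t, \mathring z(t) + z, \mathring p(t) + p\bigr) - H\bigl(t, \mathring z(t), \mathring p(t)\bigr) = \dfrac{d\mathring p^i_\alpha}{dt^i}\, z^\alpha - \dfrac{d\mathring z^\alpha}{dt^i}\, p^i_\alpha + H^*(t, z, p)$.

The differential equation (12) now changes into

$\dfrac{\partial \sigma^i}{\partial t^i} = H^*\Bigl(t^i, z^\alpha, \dfrac{\partial \sigma^i}{\partial z^\alpha}\Bigr)$,

and the initial conditions

$\dfrac{\partial s^i}{\partial z^\alpha} = \mathring p^i_\alpha(t)$  for $z = \mathring z(t)$

into

$\dfrac{\partial \sigma^i}{\partial z^\alpha} = 0$  for $z^1 = \dots = z^\nu = 0$.

The Taylor expansions of $\sigma^i(t, z)$ and $H^*(t, z, p)$ in terms of $z$ or $z$, $p$ respectively,
contain no constant and linear terms. Restoring our original notations $s$ and $H$
instead of $\sigma$ and $H^*$ we thus have shown that we may put, without any loss of
generality: $\mathring z^\alpha(t) = 0$, $\mathring p^i_\alpha(t) = 0$.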


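§15. First approximation: Legendre's differential equations. When
limiting $H$ to its quadratic term

(27)    $H_2 = \tfrac12 A_{\alpha\beta}\, z^\alpha z^\beta + A^\beta_{i\alpha}\, p^i_\beta z^\alpha + \tfrac12 A^{\alpha\beta}_{ik}\, p^i_\alpha p^k_\beta$,

the quadratic part of $s^i$:

(28)    $\tfrac12 s^i_{\alpha\beta}\, z^\alpha z^\beta$

provides an exact solution of the Jacobi-Hamilton differential equation. The
coefficients $A$ and $s^i_{\alpha\beta}$ are functions of $t$ only and are, of course, written in
symmetrical fashion:

$A_{\alpha\beta} = A_{\beta\alpha}$,  $A^{\alpha\beta}_{ik} = A^{\beta\alpha}_{ki}$,  $s^i_{\alpha\beta} = s^i_{\beta\alpha}$.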










(12) yields the following system of differential equations for the unknown $s^i_{\alpha\beta}$:

(29)    $\dfrac{ds^i_{\alpha\beta}}{dt^i} = A_{\alpha\beta} + A^\gamma_{i\alpha}\, s^i_{\gamma\beta} + A^\gamma_{i\beta}\, s^i_{\gamma\alpha} + A^{\gamma\delta}_{ik}\, s^i_{\gamma\alpha}\, s^k_{\delta\beta}$.

The transformation character is indicated again by the position of the indices;
it should be added that the three $A$'s on the right side are densities of weight
$+1$, $0$, $-1$ respectively. A solution of these Legendre equations furnishes what
may properly be called an approximate geodesic field. (Legendre's method as
he applied it to the second variation would lead exactly to the same result.)

Whereas Legendre's condition is only a part of the much stronger Weierstrass
condition, it is to be guessed that the existence of a geodesic field in the
approximate sense of the "second variation" implies its existence in the exact sense.
Our conjecture will be proved in the last Chapter. The result is two-fold:

1) The embedding of $\Sigma_0$ by a geodesic slope field is always locally possible.
This suffices for answering all questions about local minima (when only
surfaces $\Sigma$ are admitted to competition that differ from $\Sigma_0$ in a small enough
neighborhood of a point $t$).

2) The embedding goes through, even in the large, for the whole extremal $\Sigma_0$,
provided the first approximation, the solution of Legendre's equations, can be
effected.
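
For $r = \nu = 1$ the Legendre equations reduce to a single Riccati equation, which makes the link with Jacobi's condition visible. The sketch below assumes a hypothetical example not taken from the text: $L = \tfrac12(\dot z^2 - z^2)$ with the extremal $z = 0$. With the convention $H = L - p\dot z$ one finds $H = -\tfrac12(p^2 + z^2)$, and substituting $\tfrac12 s(t) z^2$ into (12) gives $ds/dt = -1 - s^2$. The Runge-Kutta integrator is likewise only an illustrative choice.

```python
import math

def riccati_rhs(s, A=-1.0, B=0.0, C=-1.0):
    # Right side of the Legendre equation for r = nu = 1:
    # ds/dt = A + 2 B s + C s^2, with the coefficients of the assumed example.
    return A + 2.0 * B * s + C * s * s

def integrate(t_end, steps=2000, s0=0.0):
    # Classical fourth-order Runge-Kutta from t = 0 to t = t_end.
    h = t_end / steps
    s = s0
    for _ in range(steps):
        k1 = riccati_rhs(s)
        k2 = riccati_rhs(s + 0.5 * h * k1)
        k3 = riccati_rhs(s + 0.5 * h * k2)
        k4 = riccati_rhs(s + h * k3)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s

s_at_1 = integrate(1.0)
exact_at_1 = -math.tan(1.0)   # closed-form solution s(t) = -tan(t) for s(0) = 0
print(s_at_1, exact_at_1)
```

The closed-form solution $s(t) = -\tan t$ becomes infinite at $t = \pi/2$, so in this example the approximate geodesic field started at $t = 0$ exists only on an interval of limited length, in keeping with what Jacobi's condition would lead one to expect.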





§16. Appendix: Necessary Local Conditions. Let us consider the extremal
$\Sigma_0$: $z^\alpha = 0$ in the neighborhood of a given point $t^i = t^i_0$ and denote the
$E$-function at that point of $\Sigma_0$, namely $E(t_0, 0; 0, u^\alpha_i)$, by $E_0(u^\alpha_i)$. One gets a
necessary local condition for a strong minimum by putting a little cone-shaped
hood on $\Sigma_0$. Its basis may be defined by $f(\tau^1, \dots, \tau^r) \le 1$ in terms of the relative
coördinates $\tau^i$: $t^i = t^i_0 + \varepsilon\tau^i$; here $\varepsilon$ is a positive constant doomed to approach
zero and $f$ is a ray function, i.e. a positive homogeneous function of degree 1:

$f(\lambda\tau^1, \dots, \lambda\tau^r) = \lambda \cdot f(\tau^1, \dots, \tau^r)$  (for $\lambda \ge 0$);
$f(\tau^1, \dots, \tau^r) > 0$ except for $(\tau^1, \dots, \tau^r) = (0, \dots, 0)$.

In terms of further arbitrary constants $v^\alpha$ the varied surface $\Sigma$ itself, the "hood,"
is described by:

$z^\alpha = \varepsilon v^\alpha \{1 - f(\tau^1, \dots, \tau^r)\}$  for $f(\tau^1, \dots, \tau^r) \le 1$,
$z^\alpha = 0$  outside this region.

The inequality $\Delta J \ge 0$ with the expression (14) for $\Delta J$ and with $\varepsilon \to 0$ leads to:

[1]    $\underset{f(\tau) \le 1}{M} \{E_0(u^\alpha_i = v^\alpha f_i(\tau))\} \ge 0$.

$f_i(\tau)$ are the derivatives $\partial f/\partial \tau^i$, $M$ is the integral extending over the domain
$f(\tau^1, \dots, \tau^r) \le 1$ in $\tau$-space that should now be looked upon as the affine "tangent
space" of the $r$-dimensional manifold $G$ in $(t_0)$; the left side of [1] is invariant in
this sense. As the $f_i(\tau)$, the components of the normal vector, are homogeneous
of order zero, the integral may equally well be interpreted as an average over
the "sphere" of all directions in $\tau$-space.

One can show by specializing the ray function $f$ in an appropriate manner
that not only the integral [1] but every element of it must be $\ge 0$. We choose
a positive constant $k$ and put

[2]    $f(\tau^1, \tau^2, \dots, \tau^r) = \max(|\tau^1|, k|\tau^2|, \dots, k|\tau^r|)$  for $\tau^1 \ge 0$,
       $f(\tau^1, \tau^2, \dots, \tau^r) = \max(k|\tau^1|, k|\tau^2|, \dots, k|\tau^r|)$  for $\tau^1 \le 0$.

Afterwards we let $k$ in [1] tend to zero. The volume of the negative half $\tau^1 \le 0$
of the region $f(\tau) \le 1$ equals $2^{r-1}/k^r$ whereas the volume of the positive part $\tau^1 \ge 0$
equals $2^{r-1}/k^{r-1}$. Let us write for a moment

[3]    $(1, 0, \dots, 0) = (u_1, u_2, \dots, u_r)$.

$f_i$ is of order $k$ in the negative half, whereas it differs from $u_i$ by quantities of
the same order in the positive half of our region. Considering the fact that
$E_0(u^\alpha_i)$ for arguments $u^\alpha_i$ of the order of magnitude of $k$ is $= O(k^2) = o(k)$ one
finds for the left side of [1] after multiplication by $(k/2)^{r-1}$ an expression

$E_0(v^\alpha u_i) + \dfrac{1}{k} \cdot o(k)$

and consequently one arrives with $k \to 0$ at

[4]    $E_0(v^\alpha u_i) \ge 0$.


The particular covariant vector [3] may here be replaced by an arbitrary one.
The result formerly obtained in a slightly different manner by McShane⁵ is the
following

Necessary local condition for a strong minimum: Unless [4] holds for arbitrary
values $v^\alpha$, $u_i$ at any point $(t_0)$ of $G$, the surface $\Sigma_0$ cannot have the minimizing
property.

An immediate consequence is the similar

Necessary local condition for a weak minimum: The quadratic form $F_0(t \mid u^\alpha_i)$
must be $\ge 0$ for such values of the variables $u^\alpha_i$ that nullify all the quadratic forms
$u^\alpha_i u^\beta_k - u^\alpha_k u^\beta_i$.

In the general case $r > 1$, $\nu > 1$ there yawns a wide gap between the
necessary and sufficient conditions; unfortunately it seems not likely that one will be
able to set up a more complete set of local necessary conditions that are
comparable in simplicity to McShane's inequalities [4].
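
The quadratic forms $u^\alpha_i u^\beta_k - u^\alpha_k u^\beta_i$ are the two-rowed minors of the matrix $u^\alpha_i$, so they all vanish for products $u^\alpha_i = v^\alpha u_i$ of the kind appearing in [4]. A short sketch of this check follows; the sample vectors and the helper name `minors_vanish` are hypothetical.

```python
def minors_vanish(u, tol=1e-12):
    # u[alpha][i]: tests whether every expression
    # u^alpha_i u^beta_k - u^alpha_k u^beta_i is zero up to the tolerance.
    nu, r = len(u), len(u[0])
    return all(abs(u[a][i] * u[b][k] - u[a][k] * u[b][i]) <= tol
               for a in range(nu) for b in range(nu)
               for i in range(r) for k in range(r))

# Hypothetical sample values: nu = 3 dependent, r = 2 independent variables.
v = [1.5, -2.0, 0.5]          # v^alpha
w = [0.3, 4.0]                # u_i
rank_one = [[va * wi for wi in w] for va in v]   # u^alpha_i = v^alpha u_i

generic = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # a matrix of rank two

print(minors_vanish(rank_one), minors_vanish(generic))
```

The weak-minimum condition therefore only constrains $F_0$ on such product-type arguments, which is weaker than definiteness on all $u^\alpha_i$ once $r > 1$ and $\nu > 1$.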


Part 4. Construction of Geodesic Fields 


§17. Cylindrical domains and fields. For the purpose of the local problem
$G$ can be assumed to be a cube. We shall solve the problem in the large





⁵ Annals of Math. 32 (1931), p. 578.











a (aw eee a! a an + 





ge OT any res 


{tay 
i 
4 
ba} 
; 













618 HERMANN WEYL 


(12) yields the following system of differential equations for the unknown s',: 


dsig 
dt’ 


The transformation character is indicated again by the position of the indices; 
it should be added that the three A’s on the right side are densities of weight 
+1,0, —l respectively. A solution of these Legendre equations furnishes what 
may properly be called an approximate geodesic field. (Legendre’s method as 
he applied it to the second variation would lead exactly to the same result.) 

Whereas Legendre’s condition is only a part of the much stronger Weierstrass 
condition, it is to be guessed that the existence of a geodesic field in the approxi- 
mate sense of the “‘second variation,” implies its existence in the exact sense. 
Our conjecture will be proved in the last Chapter. The result is two-fold: 

1) The embedding of 2» by a geodesic slope field is always locally possible. 
This suffices for answering all questions about local minima (when only sur- 
faces = are admitted to competition that differ from 2 in a small enough 
neighborhood of a point £). 

2) The embedding goes through, even in the large, for the whole extremal 3), 
provided the first approximation, the solution of Legendre’s equations, can be 
effected. 


(29) = Aug + Ate8ia tAre8paS8eg- 





§16. Appendix: Necessary Local Conditions. Let us consider the extremal 
Xo: z2* = O in the neighborhood of a given point ¢’ = ¢) and denote the 
E-function at that point of Zo, namely H(t, 0; 0, uz) by Ho(uf). One gets a 
necessary local condition for a strong minimum by putting a little cone-shaped 
hood on 2». Its basis may be defined by f(r! -- - 7") S 1 in terms of the relative 
codrdinates 7‘: tf = t} + er‘; here ¢ is a positive constant doomed to approach 
zero and f is a ray function, i.e. a positive homogeneous function of degree 1: 


f(aAr', + Ping Ar") = A-f(7', Ores #, 1’) (for r = 0); 
f(r, «++ , 7) > O except for (7! --- 77) = (0 --- 0). 


In terms of further arbitrary constants v* the varied surface > itself, the “hood” 
is described by: 


ze = ev*{1 — f(r! ..- 7*)} for f(r! --- 77) S 1, 
= 0 outside this region. 
The inequality AJ = 0 with the expression (14) for AJ and with « — 0 leads to: 


[1] M {H(uz = o*f(r))} = 0. 
f(r)S1 

fi(r) are the derivatives df/dr‘, M is the integral extending over the domain 

f(r! --- 77) S 1in r-space that should now be looked upon as the affine “tangent 

space”’ of the r-dimensional manifold G in (t); the left side of [1] is invariant in 


GEODESIC FIELDS 619 


this sense. As the f;,(r), the components of the normal vector, are homogeneous 
of order zero, the integral may equally well be interpreted as an average over 
the “sphere” of all directions in 7-space. 

One can show by specializing the ray function f in an appropriate manner 
that not only the integral [1] but every element of it must be = 0. We choose 
a positive constant k and put 


f(r .-- 77) = max. (|7'|, k| 77], --»,k{r"|) for rt > 0, 


2 
. = max. (k|r!|, k|7r?|,---, klr"|) for cr! < 0. 


Afterwards we let k in [1] tend to zero. The volume of the negative half r' < 0 
of the region f(r) S 1 equals 2’-/k" whereas the volume of the positive part r! = 0 
equals 2"-'/k"-!._ Let us write for a moment 


[3] (1,0, --- , 0) = (wu, Ue, +++ , Ur). 


f.is of order k in the negative half, whereas it differs from u; by quantities of 
the same order in the positive half of our region. Considering the fact that 
E(u?) for arguments uf of the order of magnitude of k is = O(k?) = o(k) one 
finds for the left side of [1] after multiplication by (k/2)"-' an expression 


E,(v«u;) + -o(k) 


and consequently one arrives with k — 0 at 
(4] E,(v% u;) => 0. 


The particular covariant vector [3] may here be replaced by an arbitrary one. 
The result formerly obtained in a slightly different manner by McShane’ is the 
following 

Necessary local condition for a strong minimum: Unless [4] holds for arbitrary 
values v*, u; at any point (to) of G, the surface X» cannot have the minimizing 
property. 

An immediate consequence is the similar 

Necessary local condition for a weak minimum: The quadratic form F(t | ut) 
must be = 0 for such values of the variables u{ that nullify all the quadratic forms 
uu, — upué. 

In the general case r > 1, v > 1 there yawns a wide gap between the neces- 
sary and sufficient conditions; unfortunately it seems not likely that one will be 
able to set up a more complete set of local necessary conditions that are com- 


parable in simplicity to McShane’s inequalities [4]. 


Part 4. Construction of Geodesic Fields 


§17. Cylindrical domains and fields. For the purpose of the local problem
G can be assumed to be a cube. We shall solve the problem in the large





4 Annals of Math. 32 (1931), p. 578.



















for cylindrical regions G, i.e. for regions G which are the product of an (r − 1)-
dimensional manifold G* and the open one-dimensional continuum, such that
the points P of G appear as pairs (P*, t) consisting of an arbitrary point P* of G*
and an arbitrary number t. G* may be referred (locally) to coördinates
$t^2, \dots, t^r$ and t be used as the coördinate $t^1$. Since the Hamilton-Jacobi equation
(12), preferably in its undifferentiated form as stated at the end of §5, is
invariant under topological transformations, our method yields a solution for
all manifolds topologically equivalent to a cylinder. The complete intrinsic
topological characterization of the “cylinders” is not yet known; but we certainly
get a fairly general picture of the situation in the large even though we
have to make this restriction of a topological nature. Its necessity shows,
however, that our mode of approach is not quite adequate. Every “cell,” as
for instance a convex region in ordinary $(t^1, \dots, t^r)$-space, is of course a cylinder.

We start out to construct in our cylindrical manifold G a solution $s^i$ for which
all components $s^2, \dots, s^r$ except $s^1$ vanish identically. Writing t, s instead of
$t^1$ and $s^1$, and dropping the upper index 1 where it appears with a similar meaning,
we reduce (12) to the partial differential equation with only one unknown s:

(30)    $\dfrac{\partial s}{\partial t} = H(t, z^\alpha, p_\alpha), \qquad p_\alpha = \dfrac{\partial s}{\partial z^\alpha} \qquad (-\infty < t < +\infty).$

The coördinates $t^2, \dots, t^r$ play now merely the rôle of accessory parameters.


We have

(31)    $H = 0, \quad H_{z^\alpha} = 0, \quad H_{p_\alpha} = 0 \quad$ for z = 0, p = 0

(i.e. for $z^1 = \dots = z^\nu = 0$, $p_1 = \dots = p_\nu = 0$), and our aim is to find a solution
$s(t, z^\alpha)$ making


(32) swt, 2 wt ee Gus: 
0z* 
One can get at the partial differential equation (30) with two different tools: 
either with the theory of characteristics, or following Cauchy, by power series 
and their dominants. Let us first go the former way. 


§18. The characteristic equations. The differential equations for the characteristics
of (30) read as follows:

(33)    $\dfrac{dz^\alpha}{dt} = -H_{p_\alpha}(t, z^\beta, p_\beta),$

        $\dfrac{dp_\alpha}{dt} = H_{z^\alpha}(t, z^\beta, p_\beta).$







When one is called upon to determine that solution $s(t, z^\alpha)$ of (30) which satisfies
the initial conditions (32) one has to proceed in the following manner. One
integrates (33):

(34)    $z^\alpha = \zeta^\alpha(t; z_0^\beta), \qquad p_\alpha = \pi_\alpha(t; z_0^\beta),$

with the initial values

$\zeta^\alpha(0; z_0^\beta) = z_0^\alpha, \qquad \pi_\alpha(0; z_0^\beta) = 0$

and the further equation

$\dfrac{ds}{dt} = H - p_\alpha H_{p_\alpha}$

by quadrature:

(35)    $\sigma(t; z_0^\beta) = \int_0^t \left( H - \pi_\alpha H_{p_\alpha} \right) dt.$

One then must express the initial values $z_0^\alpha$ by means of the $z^\alpha$ themselves in
solving the equations

(36)    $z^\alpha = \zeta^\alpha(t; z_0^\beta),$

and in doing so one changes the quantities $\pi_\alpha$, (34), and σ, (35), into functions
$p_\alpha$ and s of $(t, z^\alpha)$. They satisfy all the relations (30).
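The procedure just described can be tried out on a toy case. The sketch below assumes a single variable z and the sample Hamiltonian H = ½(p² + z²), which is an illustrative choice satisfying (31) and not taken from the text; for it the equation ∂s/∂t = H with s = 0 at t = 0 has the closed-form solution s = ½ z² tan t, p = z tan t. The characteristic system is integrated with a classical Runge-Kutta step and compared with that closed form.

```python
import math


def hamiltonian(t, z, p):
    """Sample Hamiltonian (an assumption for illustration): H = (p^2 + z^2) / 2."""
    return 0.5 * (p * p + z * z)


def characteristic_rhs(t, state):
    """Right-hand sides dz/dt = -H_p, dp/dt = H_z, ds/dt = H - p * H_p."""
    z, p, _s = state
    h_z = z
    h_p = p
    return (-h_p, h_z, hamiltonian(t, z, p) - p * h_p)


def integrate_characteristic(z0, t_end, steps=2000):
    """Integrate one characteristic from (z0, p = 0, s = 0) at t = 0 with RK4."""
    h = t_end / steps
    t = 0.0
    state = (z0, 0.0, 0.0)
    for _ in range(steps):
        k1 = characteristic_rhs(t, state)
        k2 = characteristic_rhs(t + h / 2, tuple(x + h / 2 * k for x, k in zip(state, k1)))
        k3 = characteristic_rhs(t + h / 2, tuple(x + h / 2 * k for x, k in zip(state, k2)))
        k4 = characteristic_rhs(t + h, tuple(x + h * k for x, k in zip(state, k3)))
        state = tuple(
            x + h / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        t += h
    return state


def closed_form_field(t, z):
    """Closed-form s and p = ds/dz for the sample Hamiltonian."""
    return 0.5 * z * z * math.tan(t), z * math.tan(t)


if __name__ == "__main__":
    z, p, s = integrate_characteristic(z0=0.3, t_end=1.0)
    print(z, p, s, closed_form_field(1.0, z))
```

In this example the characteristic is z = z₀ cos t, so the inversion of (36) is possible only while cos t ≠ 0, that is for |t| < π/2.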

The solution of the ordinary differential equations (33) is possible in the
neighborhood of t = 0 for sufficiently small initial values $z_0^\alpha$. Furthermore the
desired inversion of the functions (36) near t = 0, $z^\alpha = 0$ is possible since the
functional determinant

(37)    $\left| \dfrac{\partial \zeta^\alpha}{\partial z_0^\beta} \right|$    equals 1 for t = 0.








This remark settles the local question. 


§19. The characteristics in the large. The first step goes through in the large
too. That is to say: to a finite interval $-a \le t \le a$ arbitrarily given, one
may assign a positive constant ε such that (33) is solvable throughout that
whole interval provided all the initial values $z_0^\alpha$ are of modulus less than ε.
Let us briefly repeat the well-known proof.

Our differential equations (33) are of the type

$\dfrac{dx_i}{dt} = f_i(t; x_1, \dots, x_n) \qquad (i = 1, \dots, n)$

where $f_i(t; 0, \dots, 0) = 0$. Combined with the initial conditions $x_i = x_i^0$ for
t = 0 one replaces them by the integral equations

$x(t) = x^0 + \int_0^t f(t; x(t))\, dt$











and determines successive approximations $x^{(0)}, x^{(1)}, x^{(2)}, \dots$ recursively according
to

(38)    $x^{(h+1)}(t) = x^0 + \int_0^t f(t; x^{(h)}(t))\, dt \qquad [x^{(0)}(t) = x^0].$

Using the abbreviation |x| for the largest of the n moduli $|x_1|, \dots, |x_n|$ and
supposing the functions $f_i$ to satisfy the Lipschitz inequality

$|f(t; x) - f(t; y)| \le M\,|x - y| \qquad (-a \le t \le a)$

as long as |x| ≤ A, |y| ≤ A, one sees from (38) that the sequence of the successive
approximations is majorized by the partial sums of the series

$\sum_{h=0}^{\infty} \varepsilon\, \dfrac{(Mt)^h}{h!} = \varepsilon \cdot e^{Mt} \qquad (t \ge 0)$

and that one is allowed to go one step further in this development as long as
the preceding approximations keep within the range |x| ≤ A. It is supposed
that the initial values $x^0$ satisfy the inequality $|x^0| \le \varepsilon$. The first step is all
right because the integrand in

$x^{(1)}(t) - x^{(0)}(t) = \int_0^t f(t; x^0)\, dt$

can be replaced by the difference $f(t; x^0) - f(t; 0)$ of modulus less than εM.
Hence the whole estimation is legitimate and the approximations converge to
a solution x for which

$|x(t)| \le \varepsilon \cdot e^{M|t|} \qquad (-a \le t \le a)$

when ε is taken as $A \cdot e^{-Ma}$.
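The successive approximations (38) and the exponential bound can be illustrated numerically. The sketch below assumes a single unknown and the sample right-hand side f(t; x) = M sin x, an illustrative choice that vanishes at x = 0 and has Lipschitz constant M; the integrals are approximated by the trapezoidal rule on a uniform grid, and the function name `picard_iterates` is hypothetical.

```python
import math


def picard_iterates(f, x0, a, grid_points=2001, iterations=25):
    """Successive approximations x^(h+1)(t) = x0 + integral_0^t f(tau, x^(h)(tau)) dtau
    on a uniform grid over [0, a], using the trapezoidal rule for the integral."""
    ts = [a * j / (grid_points - 1) for j in range(grid_points)]
    h = ts[1] - ts[0]
    current = [x0] * grid_points          # x^(0)(t) = x0
    history = [current]
    for _ in range(iterations):
        integrand = [f(t, x) for t, x in zip(ts, current)]
        nxt = [x0]
        acc = 0.0
        for j in range(1, grid_points):
            acc += 0.5 * h * (integrand[j - 1] + integrand[j])
            nxt.append(x0 + acc)
        current = nxt
        history.append(current)
    return ts, history


M = 2.0      # Lipschitz constant of the sample right-hand side
EPS = 0.05   # bound on the initial value
A_END = 1.0  # right end of the interval [0, a]


def sample_rhs(t, x):
    """Sample f(t; x) = M sin x (an assumption for illustration)."""
    return M * math.sin(x)


if __name__ == "__main__":
    ts, history = picard_iterates(sample_rhs, EPS, A_END)
    exact = 2.0 * math.atan(math.tan(EPS / 2.0) * math.exp(M * A_END))
    print(history[-1][-1], exact, EPS * math.exp(M * A_END))
```

For this right-hand side the exact solution satisfies tan(x/2) = tan(x₀/2) e^{Mt}, which gives an independent check on the limit of the iteration.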
Notwithstanding the solubility of the characteristic equations (33) thus
proved, the construction in the large of the embedding geodesic field might fail
in the second step, because the functional determinant

(39)    $\left| \dfrac{\partial \zeta^\alpha}{\partial z_0^\beta} \right|$    for $z_0^1 = \dots = z_0^\nu = 0$

becomes zero for some value of t (Jacobi’s “conjugate point”). Therefore the
necessity of requiring Legendre’s equations (29) to have a solution $s^i_{\alpha\beta}$ throughout
the whole domain G.


§20. Determination of the geodesic field by means of characteristics. But
this admitted, one is able to overcome the obstacle just mentioned. We split
off the quadratic part

$s_2^i(t^k, z^\alpha) = \tfrac12 \sum_{\alpha,\beta} s^i_{\alpha\beta}(t^k)\, z^\alpha z^\beta$
a,B 










as formed by the given solution $s^i_{\alpha\beta}(t^k)$ of Legendre’s equations as our first
approximation, and thus put

$s^i(t, z^\alpha) = s_2^i(t^k, z^\alpha) + \hat s^i(t^k, z^\alpha),$

$H(t, z^\alpha, \partial s_2^i/\partial z^\alpha + \hat p^i_\alpha) - H_2(t^i, z^\alpha, \partial s_2^i/\partial z^\alpha) = \hat H(t, z^\alpha, \hat p^i_\alpha).$

The equation (12) remains valid for the “corrections” $\hat s$ and $\hat H$:

$\dfrac{\partial \hat s^i}{\partial t^i} = \hat H\left(t, z^\alpha, \dfrac{\partial \hat s^i}{\partial z^\alpha}\right);$

but the situation is improved in so far as the quadratic part $\hat H_2$ of $\hat H(z, p)$ contains
no terms $z^\alpha z^\beta$ (only products $p^i_\alpha z^\beta$, $p^i_\alpha p^k_\beta$). It is material that we start with
any given solution of Legendre’s equations without introducing the “cylindrical”
specialization $s^2 = \dots = s^r = 0$ for the $s^i_{\alpha\beta}(t^k)$. The corrections $\hat s$, though, shall
be determined in the cylindrical manner again: $\hat s^2 = \dots = \hat s^r = 0$. Thus after
returning to the old notations s, p, H instead of $\hat s$, $\hat p$, $\hat H$ all previous relations are
preserved; but we have won the further condition:

(40)    $H_{z^\alpha z^\beta} = 0 \quad$ for z = 0, p = 0.


We treat the equation (12) with the new Hamiltonian H by the method of
characteristics again, and now prove the non-vanishing of the determinant (39).
For this purpose we must consider the derivatives

$\zeta^\alpha_\beta = \dfrac{\partial \zeta^\alpha}{\partial z_0^\beta}, \qquad \pi_{\alpha\beta} = \dfrac{\partial \pi_\alpha}{\partial z_0^\beta} \qquad$ for $z_0^1 = \dots = z_0^\nu = 0$.

If $C^\alpha_\beta(t)$ denotes the second derivative

$H_{p_\alpha z^\beta} \quad$ for z = 0, p = 0,

one deduces by differentiating the second line of equations (33) with respect to
$z_0^\beta$ and taking into account the fact (40):

$\dfrac{d\pi_{\alpha\beta}}{dt} = C^\gamma_\alpha(t)\, \pi_{\gamma\beta}(t).$

Since $\pi_{\alpha\beta} = 0$ for t = 0 this leads at once to the result that $\pi_{\alpha\beta}(t) = 0$ for all
values of t. In view of this situation the first line (33) gives rise to the relations

$\dfrac{d\zeta^\alpha_\beta}{dt} = -C^\alpha_\gamma(t)\, \zeta^\gamma_\beta(t).$

Hence the determinant Δ of the $\zeta^\alpha_\beta$ fulfills the simple equation

(41)    $\dfrac{d\Delta}{dt} + c(t)\, \Delta = 0$
























where c(t) is the trace of the matrix ‖C(t)‖. The initial value of Δ(t) for
t = 0 is 1; hence from (41):

$\Delta(t) = \exp\left( -\int_0^t c(t)\, dt \right).$

This shows that Δ(t) is positive throughout the whole interval −a ≤ t ≤ a and it
even gives a fixed positive lower limit: $\Delta \ge e^{-ca}$, c being an upper bound to
c(t) in that interval. One easily infers now that a certain neighborhood $\mathfrak{N}_0$ of
$z_0 = 0$ in $z_0$-space is put into one-to-one correspondence with a neighborhood
$\mathfrak{N}_t$ of z = 0 in z-space by means of the relations (36) for every fixed t in the
interval −a ≤ t ≤ a.

Thus one succeeds in building up the correction $\hat s^i$ that is to be added to
Legendre’s approximation $s_2^i$ in order to get an exact geodesic field.
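The relation between the linear system for the derivatives of ζ and the scalar equation (41) for their determinant can be checked on a small example. The sketch below assumes a hypothetical 2×2 coefficient matrix C(t), chosen only for illustration; it integrates dζ/dt = −C(t) ζ from the unit matrix and compares det ζ with exp(−∫ c dt), where c(t) is the trace of C(t).

```python
import math


def coefficient_matrix(t):
    """Hypothetical sample matrix C(t); its trace is cos(t) + 0.5."""
    return [[math.cos(t), 1.0], [t, 0.5]]


def rhs(t, zeta):
    """Right-hand side of d(zeta)/dt = -C(t) zeta for a 2x2 matrix zeta."""
    c = coefficient_matrix(t)
    return [[-sum(c[i][k] * zeta[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def shifted(zeta, k, h):
    """Return zeta + h * k entry by entry."""
    return [[zeta[i][j] + h * k[i][j] for j in range(2)] for i in range(2)]


def integrate(t_end, steps=2000):
    """Integrate the matrix equation from the unit matrix at t = 0 with RK4."""
    h = t_end / steps
    t = 0.0
    zeta = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        k1 = rhs(t, zeta)
        k2 = rhs(t + h / 2, shifted(zeta, k1, h / 2))
        k3 = rhs(t + h / 2, shifted(zeta, k2, h / 2))
        k4 = rhs(t + h, shifted(zeta, k3, h))
        zeta = [[zeta[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
                 for j in range(2)] for i in range(2)]
        t += h
    return zeta


def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def determinant_from_trace(t):
    """exp(-integral_0^t c), with c(t) = cos(t) + 0.5 integrated in closed form."""
    return math.exp(-(math.sin(t) + 0.5 * t))


if __name__ == "__main__":
    print(determinant(integrate(1.5)), determinant_from_trace(1.5))
```

The determinant never changes sign, whatever the individual entries of ζ do, which is the point of passing from the matrix equation to (41).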


§21. The method of power series. One can hardly avoid a feeling of discom- 
fort regarding this whole process of solving the Jacobi-Hamilton equation,— 
an equation that served as a tool for the theory of extremals—by means of its 
characteristics—which are something much akin to, but not quite identical with 
the extremals. Furthermore, one ought to understand better why everything 
goes smoothly, once the existence of the first approximation is granted. Any- 
how, I thought it worth while to carry through also the second more direct 
method: application of power series whose convergence has to be secured through 
simple dominant series. Here the reason becomes perspicuous: the subsequent 
approximations depend on linear equations only, whereas Legendre’s equations for 
the first approximation are of the quadratic Riccati type. 

For our present purpose one must assume at the outset that H is analytic
in z and p and is thus given as a power series in terms of all these variables
$z^\alpha$ and $p^i_\alpha$. The expansion begins with the quadratic terms $H_2$ only. Starting
with a given solution $s^i_{\alpha\beta}(t^k)$ of Legendre’s equations, we make use of the same
trick as in §20, and thus are able to assume $H_2$ to contain no products $z^\alpha z^\beta$.
Let us subtract from H the part bilinear in z and p:

(42)    $H = C^\alpha_{i\beta}(t)\, p^i_\alpha z^\beta + H^*,$

and put the first term on the left side of our equation (12). Our solution $s^i$
should be a power series in z the terms of which we arrange by increasing
order:

$s^i(t, z) = s^i_3 + s^i_4 + \cdots;$

$s^i_n(t, z) = \sum \dfrac{n!}{n_1! \cdots n_\nu!}\, s^i(n_1 \dots n_\nu; t)\, (z^1)^{n_1} \cdots (z^\nu)^{n_\nu} \qquad (n_1 + \dots + n_\nu = n)$







is the totality of all terms of order n. The lowest order occurring is 3. The
coefficients of the n-th approximation $s^i_n$ have to satisfy equations of the type

(43)    $\dfrac{\partial s^i(n_1 \dots n_\nu; t)}{\partial t^i} - C^\alpha_{i\beta}(t)\, n_\beta\, s^i(\dots n_\alpha + 1 \dots n_\beta - 1 \dots; t) = F(n_1 \dots n_\nu; t).$

The $s^i$ in the second term on the left side contain the same indices $n_1 \dots n_\nu$ as
the first term if β = α; the same holds for β ≠ α except that $n_\alpha$ is increased and
$n_\beta$ diminished by 1. The right-hand side becomes a known function after the
preceding approximations of order lower than n have been computed. This was
the reason for our shoving over the first part of H in (42) to the left side of
our equations.


§22. Solving and majorizing the differential equations for the approximations.
At this stage we introduce again our assumption of the cylinder-like
topological nature of G, enabling us to put $s^2 = \dots = s^r = 0$ and to forget
about the variables $t^2, \dots, t^r$. (43) are changed into ordinary differential
equations

(44)    $\dfrac{ds(n_1 \dots n_\nu; t)}{dt} - \sum_{\alpha,\beta} C^\alpha_\beta(t)\, n_\beta\, s(\dots n_\alpha + 1 \dots n_\beta - 1 \dots; t) = F(n_1 \dots n_\nu; t)$

which we want to solve under the initial conditions

$s(n_1 \dots n_\nu; t) = 0 \quad$ for t = 0.

The coefficients $C^\alpha_\beta(t)$ are the same as in §20.

One knows how the solution is effected explicitly by an infinite series. One
first combines the differential equations with the initial conditions into an
integral equation

$s(t) = \int_0^t F(t)\, dt + \int_0^t C(t)\, s(t)\, dt.$

s stands here for all those $s(n_1 \dots n_\nu; t)$ for which $n_1 + \dots + n_\nu$ has the prescribed
value n ≥ 3, arranged in a single column; F has the same significance
while C(t) is the matrix of the linear transformation occurring in (44):

$s(n_1 \dots n_\nu) \to \sum_{\alpha,\beta} C^\alpha_\beta(t)\, n_\beta \cdot s(\dots n_\alpha + 1 \dots n_\beta - 1 \dots).$

The solving series

$s(t) = s^{(0)}(t) + s^{(1)}(t) + s^{(2)}(t) + \cdots$

is computed by successive integrations according to:

$s^{(0)}(t) = \int_0^t F(\tau)\, d\tau, \qquad s^{(h+1)}(t) = \int_0^t C(\tau)\, s^{(h)}(\tau)\, d\tau.$










This was mentioned merely for the purpose of deducing from it the majorizing
property: if

(45)    $|C^\alpha_\beta(t)| \le \Gamma^\alpha_\beta(t), \qquad |F(n_1 \dots n_\nu; t)| \le \Phi(n_1 \dots n_\nu; t)$

then the corresponding solution σ of the equations with Γ and Φ instead of
C and F dominates s:

$|s(n_1 \dots n_\nu; t)| \le \sigma(n_1 \dots n_\nu; t).$
Let us assume in particular that we are in possession of upper bounds

(46)    $|C^\alpha_\beta(t)| \le \Gamma, \qquad |F(n_1 \dots n_\nu; t)| \le A_n \cdot e^{(n-2)At}$

involving certain constants Γ, A, $A_n$ and valid throughout the interval 0 ≤ t ≤ a.
It is essential that neither Γ nor A depend on n. A bound like Γ can be assigned
a priori whereas the proper choice of A and $A_n$ is to be kept open for later
decision. With these dominants (46) instead of (45), all the elements of our
column $\sigma(n_1 \dots n_\nu; t)$ become equal: $\sigma_n(t)$, and the majorizing system (44) reduces
to the simple equation:

$\dfrac{d\sigma_n}{dt} - n\Gamma \cdot \sigma_n = A_n \cdot e^{(n-2)At}$

with the solution:

$\sigma_n = \dfrac{A_n}{(n-2)A - n\Gamma} \left( e^{(n-2)At} - e^{n\Gamma t} \right).$

If

$\tfrac13 A - \Gamma = B$

is positive, then the denominator $(n-2)A - n\Gamma$ will be ≥ nB > 0 for n ≥ 3.
Thus one is led to the estimation

(47)    $|s(n_1 \dots n_\nu; t)| \le \dfrac{A_n}{nB}\, e^{(n-2)At};$

consequently

$s(t, z^\alpha)$ is dominated by $\sum_{n=3}^{\infty} \dfrac{A_n}{nB}\, z^n e^{(n-2)At},$

$\dfrac{\partial s}{\partial z^\alpha}$ is dominated by $\sum_{n=3}^{\infty} \dfrac{A_n}{B}\, z^{n-1} e^{(n-2)At}$

$(z = z^1 + \dots + z^\nu).$
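The scalar majorizing equation and the estimate (47) lend themselves to a quick numerical check. The sketch below assumes sample constants Γ = 1, A = 6 (so that B = A/3 − Γ = 1) and A_n = 1, all of them illustrative; it compares the closed-form solution with a Runge-Kutta integration and with the bound A_n e^{(n−2)At}/(nB).

```python
import math


def sigma_closed_form(t, n, gamma, big_a, a_n):
    """Solution of d(sigma)/dt - n*gamma*sigma = a_n * exp((n-2)*big_a*t), sigma(0) = 0."""
    denom = (n - 2) * big_a - n * gamma
    return a_n / denom * (math.exp((n - 2) * big_a * t) - math.exp(n * gamma * t))


def sigma_numeric(t_end, n, gamma, big_a, a_n, steps=4000):
    """The same solution obtained by classical RK4 integration from sigma(0) = 0."""
    def rhs(t, sigma):
        return n * gamma * sigma + a_n * math.exp((n - 2) * big_a * t)

    h = t_end / steps
    t, sigma = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, sigma)
        k2 = rhs(t + h / 2, sigma + h / 2 * k1)
        k3 = rhs(t + h / 2, sigma + h / 2 * k2)
        k4 = rhs(t + h, sigma + h * k3)
        sigma += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return sigma


def bound_47(t, n, gamma, big_a, a_n):
    """Right-hand side of the estimate (47), with B = big_a / 3 - gamma assumed positive."""
    b = big_a / 3.0 - gamma
    return a_n / (n * b) * math.exp((n - 2) * big_a * t)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, sigma_closed_form(0.5, n, 1.0, 6.0, 1.0), bound_47(0.5, n, 1.0, 6.0, 1.0))
```

The bound is attained in the denominator only for n = 3, where (n − 2)A − nΓ = 3B exactly; for larger n there is room to spare.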
§23. Recursive formula for the upper bounds. In order to determine an
upper bound of the desired form (46) for $F(n_1 \dots n_\nu; t)$, we first have to
majorize the given Hamiltonian $H(t, z^\alpha, p_\alpha)$. Such a dominant may obviously be
chosen in the form

$\dfrac{M(z + p)^2}{1 - R(z + p)} - M z^2,$

since the products $z^\alpha z^\beta$ in the quadratic term $H_2$ of H are missing. p stands for
$p_1 + \dots + p_\nu$ as z stands for $z^1 + \dots + z^\nu$. The factors R and M are constants
valid throughout the whole interval −a ≤ t ≤ a. Γ = 2M is then a
proper upper bound for the $C^\alpha_\beta(t)$, and H* is dominated by

(48)    $\dfrac{M(z + p)^2}{1 - R(z + p)} - M(z^2 + 2zp).$


We now replace p by its dominant as given at the end of the last section;
that is, by z·f(ζ) where

(49)    $f(\zeta) = \nu \sum_{n=3}^{\infty} \dfrac{A_n}{B}\, \zeta^{n-2}$

depends only on the combined argument $\zeta = z \cdot e^{At}$. The dominant (48) is still
enlarged when one replaces R in the denominator by $R \cdot e^{At}$; it then takes on the
form

(50)    $\dfrac{M z^2 (1 + f(\zeta))^2}{1 - R\zeta(1 + f(\zeta))} - M z^2 (1 + 2f(\zeta)).$

The coefficient of $z^n$ herein is an upper bound for $F(n_1 \dots n_\nu; t)$ provided that
the inequalities (47) prevail for all orders less than n. Because (50) equals $z^2$
times a function of $\zeta = z \cdot e^{At}$, that upper bound is precisely of the form (46).
In this way we have arrived at a proof of (47). The factors $A_n$ are determined
by the following recurrent equation for the generating function (49):

$\dfrac{M z^2 (1 + f(\zeta))^2}{1 - R\zeta(1 + f(\zeta))} - M z^2 (1 + 2f(\zeta)) = \sum_{n=3}^{\infty} A_n z^n e^{(n-2)At}$

or

(51)    $\dfrac{(1 + f)^2}{1 - R\zeta(1 + f)} - (1 + 2f) = \dfrac{B}{\nu M}\, f.$


§24. The auxiliary quadratic equation. Final conclusions. The recurrent
computation of the coefficients $A_n$ of f(ζ) guarantees that they are positive,
whereas the solution of the quadratic equation (51) for f will show that the
series (49) is convergent in a circle round the origin. This settles the convergence
for our successive approximations.



























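But let us be a little more explicit! On putting

$R\zeta = u, \qquad 1 + f = \varphi$

our equation becomes

$\dfrac{\varphi^2}{1 - u\varphi} = \left( 2 + \dfrac{B}{\nu M} \right) \varphi - \left( 1 + \dfrac{B}{\nu M} \right).$

Hence we choose a constant α > 2, take β = α − 1, and consider the equation
for φ:

$\varphi^2 = (\alpha\varphi - \beta)(1 - \varphi u).$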


If

(52)    $\varphi = a_0 + a_1 u + a_2 u^2 + \cdots$

is its solution with the initial coefficient $a_0 = 1$, then the inequalities (47) will
hold with

$B = \nu M(\beta - 1), \qquad \tfrac13 A = M[\nu(\beta - 1) + 2], \qquad \nu A_n / B = a_{n-2} R^{n-2}.$

We find

(53)    $\varphi = \dfrac{(\alpha + \beta u) - \sqrt{(\alpha + \beta u)^2 - 4\beta(1 + \alpha u)}}{2(1 + \alpha u)}.$

The square root must be taken with the minus sign at u = 0 in order to have
the expansion (52) of φ start with the term 1. The quadric under the square
root

$(\beta u - \alpha)^2 - 4\beta = \beta^2 (u - u_1)(u - u_2)$

has two positive roots

$u_1, u_2 = \dfrac{1}{\beta} \left( \alpha \mp 2\sqrt{\beta} \right).$
Cauchy’s integral formula gives the following expression for $a_n$:

$a_n = \dfrac{1}{2\pi i} \oint_k \dfrac{\varphi(u)\, du}{u^{n+1}}.$





The integral extends over a small circle k about the origin. The function φ(u)
is regular in the complex u-plane to be slit along the line $u_1 \le u \le u_2$. It has
no pole, since the numerator in (53) vanishes for u = −1/α where 1 + αu = 0,
and it is finite at infinity. For negative real values of u the square root in (53)
is positive, so that the value of φ at infinity equals β/α. Thus k may be replaced,
for n ≥ 1, by a path closely surrounding the incision. One adds together
in the usual manner the contributions from opposite points on the two borders
of the slit, and thus arrives at the formula

(54)    $a_n = \dfrac{\beta}{2\pi} \int_{u_1}^{u_2} \dfrac{\sqrt{(u_2 - u)(u - u_1)}}{(1 + \alpha u)\, u^{n+1}}\, du \qquad (n \ge 1),$





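which proves anew the positiveness of $a_n$. In the case n = 0 one has to add to
the path around the slit an infinitely large circle K whose contribution will be

$\dfrac{\beta}{\alpha} \cdot \dfrac{1}{2\pi i} \oint_K \dfrac{du}{u} = \dfrac{\beta}{\alpha}.$

Since $a_0 = 1$ and 1 − (β/α) = 1/α one finds here

(55)    $\dfrac{1}{\alpha} = \dfrac{\beta}{2\pi} \int_{u_1}^{u_2} \dfrac{\sqrt{(u_2 - u)(u - u_1)}}{(1 + \alpha u)\, u}\, du.$

(55) yields the following bound for (54):

$a_n \le \dfrac{1}{\alpha} \cdot \dfrac{1}{u_1^n} = \dfrac{1}{\alpha} \left( \dfrac{\beta}{\alpha - 2\sqrt{\beta}} \right)^n.$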
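The positivity of the coefficients and the geometric bound can be verified on a concrete case. The sketch below assumes the quadratic relation φ² = (αφ − β)(1 − φu) with β = α − 1 and the sample values α = 5, β = 4 (an illustrative choice); the function names are hypothetical. The coefficients are generated exactly by comparing powers of u in (1 + αu)φ² − (α + βu)φ + β = 0.

```python
import math
from fractions import Fraction


def series_coefficients(alpha, beta, count):
    """Coefficients a_0, ..., a_{count-1} of the power series solution with a_0 = 1.

    Comparing the coefficients of u^n in (1 + alpha*u)*phi^2 - (alpha + beta*u)*phi + beta = 0
    gives (alpha - 2)*a_n = sum_{i=1}^{n-1} a_i*a_{n-i} + alpha*[phi^2]_{n-1} - beta*a_{n-1}.
    """
    a = [Fraction(1)]
    for n in range(1, count):
        square_prev = sum(a[i] * a[n - 1 - i] for i in range(n))  # [phi^2]_{n-1}
        inner = sum(a[i] * a[n - i] for i in range(1, n))          # part of [phi^2]_n without a_n
        a.append((inner + alpha * square_prev - beta * a[n - 1]) / (alpha - 2))
    return a


def closed_form(u, alpha, beta):
    """The root of the quadratic that equals 1 at u = 0."""
    disc = (alpha + beta * u) ** 2 - 4 * beta * (1 + alpha * u)
    return ((alpha + beta * u) - math.sqrt(disc)) / (2 * (1 + alpha * u))


def coefficient_bound(n, alpha, beta):
    """Bound (1/alpha) * (beta / (alpha - 2*sqrt(beta)))**n, stated for n >= 1."""
    return (1.0 / alpha) * (beta / (alpha - 2.0 * math.sqrt(beta))) ** n


if __name__ == "__main__":
    coeffs = series_coefficients(Fraction(5), Fraction(4), 8)
    print([str(c) for c in coeffs])
```

With these sample values the smaller root of the discriminant is u₁ = 1/4, so the series converges for |u| < 1/4 and the bound grows like 4ⁿ.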


Let us put 8B = y?, a = 7? + 1, and replace 7 by A in the final result. We 








then find pa = 08/dz% to be dominated by 


r4 ; - Y . 3 Ait] 7 
1+? > (24) ” ‘ 


n=1 





where 
gadtpe. te, A= Mo(y?- 1) 42. 


The number γ > 1 may be chosen at random, whereas M and R are fixed by
the nature of the Hamiltonian $H(t^i, z^\alpha, p^i_\alpha)$ and the solution $s^i_{\alpha\beta}(t)$ of Legendre’s
equations. A reasonable choice for γ would be γ = 2.


THE INSTITUTE FOR ADVANCED STUDY,
PRINCETON, NEW JERSEY.
































ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


THE INCONSISTENCY OF CERTAIN FORMAL LOGICS 
By S. C. KLEENE and J. B. ROSSER
(Received November 13, 1934) 


1. Prospectus. We discuss here some formal logics which are inconsistent 
in the sense that every formula in their notation is provable, irrespective of its 
meaning under the interpretation intended for the symbols. This results from 
the presence of a form of the Richard paradox, which we deduce, by utilizing 
representations of the logics within themselves, somewhat as follows: 

A theory of positive integers x is constructed in the formal logic under
consideration, such that the proposition x is a positive integer is expressible as a
formula N(x), and the proposition Q is a positive integral function of positive
integers or whenever N(x), then N(Q(x)) as a formula F(Q). F(P) is proved for a
given P. Utilizing this theory, we select a representation of the formulas A of 
the logic by other formulas a of the same logic of such nature that, in general, 
the functions of the a’s which correspond to metamathematical notions relating 
to the form of the A’s are definable in the logic.' We require also a formula G 
which transforms each a into the logical product of the corresponding A and a 
provable formula T, i.e. such that G(a) = A-and-T is provable. Then a 
formula § is found which enumerates the representations of the provable 
formulas, i.e. such that for each a which represents a provable A there is at least 
one positive integer x such that (x) = a is provable, and conversely for each 
positive integer x there is an a which represents a provable A such that $(x) =a 
is provable. A formula expressing whenever N(x), then G(G(x)) is proved. A 
formula L is found such that L(a) = a is provable if A is of the form F(Q), and 
L(a) = c, where c is the representation of F(P), is provable otherwise. Using 
G, a formula is found such that W(a) = Q is provable if a is the representation 
of FQ). Let U(r) = Wi(L(H(z))). Then U enumerates the Q’s such that 
F(Q) is provable. Using whenever N(x), then G(S(x)), we prove whenever N(x), 
then F(U(x)). Thence, letting Q(x) = 1 + {U(x)}(x), we prove F(Q). Hence 
Q is in the enumeration of Q’s; but, formally, its x-th value exceeds the x-th
value of the x-th Q by 1.²
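The arithmetic core of the diagonal step, Q(x) = 1 + {U(x)}(x), can be illustrated outside the formal logic. The sketch below is only an informal analogue: `enumeration` is a hypothetical stand-in for U, listing the functions x ↦ n·x + 1, and `diagonal` builds Q from any such enumeration. It shows that Q differs from the q-th listed function at the argument q, which is what becomes contradictory once Q itself is shown to occur in the enumeration.

```python
def enumeration(n):
    """Hypothetical enumeration U: the n-th listed function is x -> n * x + 1."""
    return lambda x: n * x + 1


def diagonal(enum):
    """Build Q(x) = 1 + (enum(x))(x) from an enumeration of functions."""
    return lambda x: 1 + enum(x)(x)


Q = diagonal(enumeration)

if __name__ == "__main__":
    for q in range(1, 6):
        # Q(q) exceeds the q-th value of the q-th listed function by exactly 1.
        print(q, enumeration(q)(q), Q(q))
```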

In order that a G exist, the logic must permit great freedom in the expression 
of its formulas as values of functions. Logics have been proposed which have 





1 This type of representation was first used by K. Gödel in “Über formal unentscheidbare
Sätze der Principia Mathematica und verwandter Systeme I,” Monatshefte für Math. u.
Physik, Vol. 38 (1931), pp. 173-198.

2 Some of the considerations entering into this argument are discussed intuitively by A.
Church in “The Richard paradox,” Am. Math. Monthly, Vol. 41 (1934), pp. 356-361.




this freedom, e.g. a combinatory logic of Curry³ and a system of Church.⁴ Both
of these systems are inconsistent. Indeed, given the functional notation,
together with the postulates governing it, of either,⁵ a few additional properties
suffice for the proof of inconsistency along the foregoing lines.

In the remaining sections we consider these results in greater detail. 


2. Logics with the undefined terms { }( ), λ[ ], Π, Σ, &, and variables. We
define well-formed formula, free and bound symbol, and variable as in Kleene
1934 §1, and use the abbreviations of Church and Kleene (except as otherwise
provided), and the convention that letters in heavy-type denote well-formed
expressions.

Theorem A. If in a given logic Church’s Rules of Procedure I-V⁶ are rules of
procedure or valid methods of proof, and Church’s Formal Postulates 1, 3-11,
14-16⁷ are provable formulas, then in that logic every well-formed formula with
no free variables is provable.⁸

This theorem follows from Kleene 1935 thus:⁹ Let F → λf·N(x) ⊃ₓ N(f(x))
and P → S. Then F(P) is provable in C₁,¹⁰ by conversion from 3.2. Choose U





3 H. B. Curry, “Grundlagen der kombinatorischen Logik,” Am. Jour. of Math., Vol. 52
(1930), pp. 509-536, 789-834; “Some additions to the theory of combinators,” ibid., Vol. 54
(1932), pp. 551-558; “The universal quantifier in combinatory logic,” Annals of Math., Vol. 32
(1931), pp. 154-180; “Apparent variables from the standpoint of combinatory logic,” ibid.,
Vol. 34 (1933), pp. 381-404; “Some properties of equality and implication in combinatory
logic,” ibid., Vol. 35 (1934), pp. 849-860. These papers will be cited by author and year.

The system of Curry which we are considering is that one whose postulates are the axioms 
and rules introduced in these five papers; our remarks do not apply, for example, to the 
system obtained by assuming only the axioms and rules given in the first four. 

4 A. Church, “A set of postulates for the foundation of logic,” Annals of Math., Vol. 33
(1932), pp. 346-366, and Vol. 34 (1933), pp. 839-864. The theory of these postulates is
further developed by S. C. Kleene in “Proof by cases in formal logic,” Annals of Math., Vol.
35 (1934), pp. 529-544, and “A theory of positive integers in formal logic,” Am. Jour. of Math.,
Vol. 57 (1935), pp. 153-173, 219-244. We shall cite these papers by author and date.

5 Or combinatory notation and postulates equivalent to the functional notation and
postulates of the latter, as given by J. B. Rosser in “A formal logic without variables,”
Annals of Math., Vol. 36 (1935), pp. 127-150, and Duke Math. Jour., Vol. 1 (1935), No. 3.

6 By Church’s Rules of Procedure I-V we shall mean the rules of Church 1932, p. 355-6 as
revised in Kleene 1934 §1.

7 Church 1932 p. 356, or 1933 p. 841.

8 By valid method of proof we mean a rule the addition of which to the rules of procedure
does not increase the class of provable formulas. (The terms logic, rule of procedure and
provable may be understood as in Kleene 1934 p. 529.)

1, 3-11 can be replaced in this theorem by a set of formulas equivalent (when the rules of 
procedure are I-V, and 14-16 are axioms, and the interpretations of Kleene 1934 §1 are 
employed) to Church’s Theorem J (1932, p. 358). 

If the logic has axioms containing ~ (‘‘not’’) as a free symbol (such as Church’s 17-27), 
then both P and ~ P are provable for every P having no free variables. 

9 We give this proof, although the theorem is a consequence of Thm. C, because the
proof of the latter will be given only in outline and by comparison with this proof.

10 C₁ denotes the logic whose rules of procedure and formal postulates are Church’s I-V
and 1, 3-11, 14-16, respectively. In accordance with the program stated at Kleene 1934 p.
530, “provable” at this point means “provable in C₁.”













in accordance with 19XIII. By 19XIII(2), N(m) >,F(U(n)), and, by con- 
version, (1) N(n) D,»-N(x) D.N(U(n, x)). Thence, using 3.1, 5.2 and Thm. I, 
N(n) D,N(1 + U(n, n)), and by conversion F(Q), where Q — dn-1 + U(n, n). 
Thus F(Q) is provable in C;. By 19XIII(1), there is a positive integer ¢ such 
that (2) U(q) conv Q. N(q) is provable from 3.1 by g — 1 successive applica- 
tions of 3.2. Using (1), weinfer (3) N(U(q, q)). Now 1 = [1 + U(q,q)] — U(q, q) 


(11.2, 3.1, (3)), conv [1 + Q(q)] — U(q, q) (by (2)), conv [1 + -1 + U(q, q)] - 
U(q, q) (using the def. of Q), = 2 (conversion, 11.2, 3.1, 3.2, (3)). Hence, 
by §2,1 = 2. The conclusion follows by Kleene 1934 10I. 


3. Logics with the undefined terms { }( ), λ[ ], Π, &, and variables. We
now outline modifications of the preceding proof by which the use of Σ may be
dispensed with.

Redefine E to be Ax-x = x, and abbreviate E(x) >.M to ‘x-M. 

Theorem B. Π(λg·g(λfx·f(x)), λg·g(λfx·f(f(x)))) is a provable formula in
the logic whose rules of procedure and formal postulates are the following (BI-BVII
and B1-B19, resp.):


BI-BIII. Church’s Rules of Procedure I-III. 

BIV. If F(A), then I(F, F). 

BV. If F(A)-II(F, G), then G(A). 

BVI. If PQ, then P. 

BVII. If P and Q, then PQ. 

Bl. p Dy: pg Dag. 

B2. ‘a-f(a) Dy; -[ f(x) Dz g(x)] Dy -[g(x) Dz h(x)] Dr -f(x) Dz h(a). 

B3. ‘a-‘b-g(a, b) D, -g(x,b) D, E(x)E(b). 

B4. ‘a-f(a) Dy; - [f(x) Dz E(x) E(b)] Ds - (f(x) Dz g(a, b)] D, -f(x) Dz M(g(z), 
g(x)). 

B5. ‘a- E(b(a)) De -g(a, b(a)) Dy -g(x, b(x)) Dz E(x) E(0(z)). 

B6. ‘a-f(a) Dy - (f(z) Dz E(x) E(b(x))] Ds - [f(x) Dz g(a, b(x))] Dy -f(e) rz 
II(g(x), g(x). 

B7. ‘a-f(a) Dy -‘b-g(b) D, - [f(x) Dz g(b)M, h(x))] Dn -f(a) Dz A(z, 6). 

B8. ‘a-f(a) Dy -q Dy -f(x) Dz f(x)q. 

BY. ‘a-f(a) Dy - [f(z) Dz H(b(x)) M(B, E)] Ds - (f(z) Dz g(b(x))MG, 9)] > 
- [f(z) Dz g(b(x))M1(g, h)] Pr -f(x) Dz h(O(z)). 

B10. ‘a-f(a) D,; - [f(x) Dz g(x)] D, - [f(x) Dz h(x)] Dr -f(x) Dz g(x)h(z). 







Bll. ‘a-E(b(a)) Ds -g(b(a)) >, -g(b(x)) >. Mg, Ay- E(x) E(y)). 

B12. ‘a-f(a) Dy - (f(z) Dz E(b(z)) ME, dy- E(z)E(y))] De - (f(x) Dz g(b(x)) 
1(g, \y-E(x)E(y))] Do f(x) Dz g(b(z)) 1, h(x))] Da- f(z) Dz h(x, b(2)). 

B13. ‘a-f(a) >, - [f(x) Dz E(x) E(b)MAy- E(x) E(y), \y- E(x) E(y))] Ds - [f(2) 
D,.g(x, b)M(g(x), g(z))] Dy - (f(z) Dz g(x, b)M(g(x), A(x))] Dx -f(x) DP. 
h(z, b). 

B14. ‘a-E(b(a)) Ds -g(a, b(a)) Dy, -g(z, b(x)) Dz M(g(x), EZ). 
B15. ‘a-f(a) Dy - [f(z) PD. E(x) E(b(z))MAy- E(x) Ely), E)] Ds - (f(z) Dz g(z, 
b(x)) (g(x), Z)] Dy - [¥(@e) Dz g(x, b(x))M(g(x), h)] Dr -f(x) Dz h(b(z)). 
B16. ‘a-f(a)>y - [f(z) PD. E(x) E(b(z)) My: E(x) Ey), dy- E(x) E(y))] De -[f(x) 
D, g(x, b(x))M(g(x), g(x))] D, - [f(w) Dz g(a, b(x))M(g(x), h(x))]>, -f(x) 
D, h(x, b(z)). 

B17. ‘a-f(a) Dy; +g Aq -f(z)q Pz f(z). 

B18. ‘a-f(a) Dy -g(a) Dy -f(x)g(x) Dz f(z). 

B19. ‘a-f(a) Dy -p Dp -f(z) Dz -p-f(z). 


In proving this theorem, we shall use the following: 

Weak Form of Church’s Theorem I. If the variable x does not occur in
F, G or M as a free variable, and if G(x) is provable as a consequence of F(x) and M,
then F(x) ⊃ₓ G(x) is provable as a consequence of F(A) and M.¹¹

This is proved by induction on the number of applications of BIV-BVII in 
the proof of G(x) from F(x), M and the axioms (ef. the proof of Thm. I, Church 
1932, pp. 358ff). For illustration we give one of the cases: Suppose the last 
application is the inference by BV of h(b) from g(b)II(g, h), where x is a free 
symbol of b and h, but not of g. Then, using the hypothesis of the induction, 
F(x) >, g(b’(x))I(g, h’(x)) where b’ — \x-b and h’ > dx-h. Using F(A), we 
obtain g(b’(A)), and thence, by BIV, g(b’(x)) >, g(b’(x)) or g(b’(x)) Dx {Axy- 
y(b’(x))}(x, g). Using E(A) and E(g) (both obtained by use of BIV) and 
\Axy-y(b’(x))}(A, g) in B3, we obtain {Axy-y(b’(x))}(x, g) Dx H(x)E(g) or 
g(b’(x)) >, E(x)E(g). The last two results with B4 yield g(b’(x)) Dx II(Ay- 
y(b’(x)), Ay-y(b’(x))) or g(b’(x)) Dx, E(b’(x)). Now using Bll, E(b’(x)) 
>: II(E, \y- E(x) E(y)); hence, by B2, g(b’(x)) >, II(E, Ay- E(x) E(y)); and, by 





" “Cis provable as a consequence of D,, Ds, --- ”’ (or “Di, Dz, --- | C’’) shall mean that 
Cis derivable from D,, Dz, --- and the axioms of the logic under consideration at the time 
by means of its rules of procedure. 

In practice, either the M is superfluous, or it stands for the logical product of previous 
“assumptions” made in preparation for further use of this and like theorems. 

























B10, g(b’(x)) Ds E(b'(x))I(E, dy-E(x)E(y)); also, by B18, g(b’(x))I(g, h’(x)) 
>, g(b’(x)); and, by B2, F(x) >; g(b’(x)); and hence, by B2, F(x) >, E(b’(x)) 
T(E, d»y-E(x)E(y)). Similarly, F(x) >. g(b’(x))M(g, Ay-E(x)E(y)). Then 
using B12, F(x) >, h’(x, b’(x)), and, by conversion, F(x) D,G(x). 

We now parallel as closely as possible the theory given in Kleene 1935. 

Whenever Kleene proved Σ(F) and used Thm. I, he got the Σ(F) by proving
F(A) (except at certain points in the proof of 17.1), so the weak form of Thm. I
can be used instead. However, without the Σ, we have no general method of
stating implications on several variables. In some cases this causes no trouble.
Thus we could just as well write 19.20 as ad(b)E(G(b)) > -ad(a)E(G(a, 
G(b))) D. -G(a, G(b)) = G(<a, b>). With this in mind, we parallel§§1-13. 

However the material given in Kleene 1934 §8 (on which 1934 9I and 1934 
10II depend) is not amenable to this sort of treatment, so we must find different 
proofs for the theorems proved by case arguments. Whenever “examples” 
can be found under both cases, we can use Thm. I and the definition of M instead. 
(More exactly, suppose that the case hypotheses B = 1 and B = 2 contain a free 
variable x, that R is the logical product of all the other assumptions in which 
x occurs as a free variable, and that {Ax-R-B = 1}(X:) and {Ax-R-B = 2}(X,) 
can be proved. Then, by the case arguments and Thm. I, [R-B = 1] D,E(x)C 
and [R-B = 2] D,E(x)C. Bv the def. of M and Thm. I, M(w) >, -[R-B = w] 
D,E(x)C. C follows from M(B), R, and this formula.) Otherwise, we avoid 
the use of cases by more considerable changes in the proof, or modify the 
theorem in such a manner that examples become available. 

To prove 14.14b, we use the lemma M(a) >, -M(b) >, B(4—-a + b, a, b), 
where $(1) conv \ab- M(a)M(b) and 8(2) conv d\ab-[a = 1] [b = 1]. 

The last clause of 15IV holds whenever a T can be found such that | T(A) 
and T(X) } T(F(X)) for arbitrary X. (For then N(x) >, -T(L(x))-L(S(x)) = 
F(L(x)) is provable by induction.) The last clause of 15V holds whenever a T 
can be found such that + T(A) and N(Y¥)T(X) | T(F(Y, X)) for arbitrary X 
and Y. (Forthen \x-T(x(1)) is a “T” for the application of 15IV in the proof of 
15V.) 15IV and 15V as thus qualified suffice, since T’s can be given for all the 
applications which we require, as follows: §16, S: Ax-N(a(J, 1)). §17, 8: 
dx -[x(Aa-I*(1)) = Am-m(2, x(dra-I2(1), Apgr-I7(I*(q))), 1] [N(x(da-I*(1), 
Apgr-I?(I"(q))))]; L: N; F: \x-T(x(1)) (ef. 171. §19, 1, redefined so that 
t(S(k)) conv Am-m(Apgr-t(k, g, I, I, r(k, r, I, I, p))) (k = 1, 2,---): 
Az-x(I, I, I, I) = I; e and e’: \x-ad(z([1])) (using [1], = [12 = [1]; 9: 
Ax-a([1], [1]) = 2; g: Ax-x((1]) = I; i: ad. The remainder of §15 (after 15V) 
is not used. 

We replace 17.1 by [N(¢) >;-r(é) > 1] D, -N(p) D,-[x < S(p) A 
y < S(r(z)) D>, EF, y))] Dy -[t < S(p) Dz -y < S(r(x)) D, t(f(z, y))] Pe 


z< s( > “()) >. t(Q(f, r, z)), and change 17.2 similarly. Then we have 
i=1 


an example under Case 2 of (i), and prove E (c)-G,(1, a) = 1| . 










¢,(1, S(c)). This with the lemma N(r) >, -[¢(1) -[N(c)¢(c)-¢ < S(r)] D, 
$(S(c))] Dy o(S(r)) gives an example for Case 1. Under (ii), we define a, — 
- (2% (o)) — zand b, > ne-{ nal + r(a)] — p 3s rio} (a,(z)), 


w=1 i=1 i=l] 


and prove N(p) Dp-2 < (5 ) >: {nab < S(p)-b < S(r(a))-z = 


Pick 6 ra) Vale 6,(2)). 


i=l 

§18 is not used. 

After reaching 19.21, certain changes are required to make the discussion 
apply to the new list of rules and axioms. We read II& in place of IZ&. The 
old Rey — Reg we may regard as having been obtained by subdividing IV 
(If F(P), then 2(F)) and V (If F(P) and II(f, G), then G(P)) into 2° + 2° rules 
R, (t = 39, --- , 614) by distinguishing cases according to the occurrence or 
non-occurrence of II, 2, and & in F, P, and G as free symbols, and then construct- 
ing the R, from the R;, according to a certain general pattern. We now sub- 
divide BIV-BVII into 24 + 2° + 2¢ + 2trules R, (¢ = 39,---, 150) with 
respect to the free occurrences of II and &, and construct corresponding rules 
R, in a precisely analogous manner. %%,-%13; are replaced by combinations 
%.-Y%is representative of B1—B19; etc. 

Furthermore, we let C, denote the logic whose axioms are B1—B19 and whose 
rules of procedure are BI-BIII and a subset R., -R,, of R3o—Ri 50; define 


739 
Y =v(v = 1, --- ,38);and replace C; by C,, R, and ®, (¢ = 1, --- ,614) by R, | 
and Rt, (wu = 1, --+ , py), resp., mand n by m, and n,, resp., § by §,, and U by 
U,. Then, if we assume concerning y that all the ce, (w = 39, ---, py) are 


used in given proofs, we can finish paralleling §19, obtaining in conclusion 19XIII 
with C; and U replaced by C, and U,, resp. (The assumption is used to provide 
examples for the cases in the proof of 19.23(u)). 

Finally, we parallel the proof given in §2 of the present paper. First take 
R, -R, * to be the cases of BIV-BVII which are used in the proof of 3.2. Then 


139 
the assumption on y under which we just proved 19XIII is satisfied, and also 
F(P) is provable in C,. Hence, using 19XIII(2), we obtain a proof of F(Q,), 
where Q, — \n-1 + U,(n, n). Now if this proof uses any of R;-Riso not in 
the list RR, (i.e. if it is not a proof in C,), we add them to the list, and 
repeat the argument with the new y thus obtained; and so on. Eventually we 
obtain a y such that F(Q,) is proved in C,. Then using 19XIII(1), we can 
complete the proof of 1 =2. By conversion, I(Ag-g(Afz-f(x)), Ag-g(Afz-f(f(x)))). 

THEOREM C. If in a given logic CI-CVI of the list which follows are rules of
procedure or valid methods of proof, and C1 and C2 are provable formulas, then in 
that logic every well-formed formula with no free variables is provable: 


CI-CIII. Church’s Rules of Procedure I-III. 
CIV. If PQ, then P. 













CV. If F(A) and F(x) >, G(x), where x is a variable not occurring in F or G,
then G(A). 


CVI. The weak form of Church’s Theorem I. 
Cl. pDp +4 Pa PF. 
C2. p Dp -pq >, 9. 


Proof. Let Il’ — Afg-f(x) D, g(x). Then if we replace II by II’ in the logic of 
Thm. B, its rules of procedure become valid methods of proof, and its formal 
postulates become provable formulas in the given logic. Hence 


I’Qag-g(afz-f(x)), Ag-gAfx-f(f(z)))) 


is provable in the same. By conversion, 1 = 2. The conclusion follows by 
Kleene 1934 10I (which holds good for the given system). 


4. Combinatory logics. Rosser has shown that the functional notation and 
rules of Church can be cast in combinatorial form. Consequently Thms. A-C 
have combinatorial equivalents.!? 

We turn now to the above-mentioned system of Curry. 

THEOREM D. If in a given logic Curry’s Rules E-P¹³ are rules of procedure or
valid methods of proof, and Curry’s Axioms Q-I., (ILB)—(IIP), Mo, (PB)-(PK)¹⁴
are provable formulas, then in that logic every entity is provable.

Proof. Define \y[Y] to be [y]Y, {F'}(A) to be (FA), I to be [f, g](x) (fz D gz),
and & to be [p, a] (x) ((f) (fK > ({(CI) > fx)) D> xp(Kq)).¹⁶ Then it can be
shown that BI-BVII hold and B1-B19 are provable. Hence, by Thm. B,
+ HQg-gAfz-f(z)), dg-gAfz-S(f(z)))). + G) GI > g(WB)) follows com- 
binatorially. Hence, letting T be any entity such that | 7 and F be any 
entity, | ([7]2CKFT)I > ([z]zCKFT)(WB). | ((z]2zCKFT)I follows com- 
binatorially from + J. Hence | ((z]zCKFT)(WB), from which | F follows 
combinatorially. 



















PRINCETON UNIVERSITY, PRINCETON, N. J. 





12 To obtain such equivalents, we need only to utilize the definitions of I, J and combination, and Thm. 6V, of Kleene 1934 §6, and the property of the thirty-eight rules R1-R38 of Rosser, loc. cit., Section H that if A and B are combinations, and A conv B, then A is derivable from B by R1-R38.

13 Curry 1930 p. 522.

14 Curry 1930 p. 521, 1931 p. 169, 1933 p. 399, 1934 p. 850.

15 We use entity here in Curry’s sense (cf. 1930 I C and 1931 p. 157). In particular, every combination of Curry’s B, C, W, K, Q, 0, P, A is an entity.

16 Here Y, F and A represent combinations, and y represents a variable occurring in Y.

To facilitate the statement of the relation between the present system and that of Thm. 
B, we suppose Curry’s II replaced by some other symbol, and (without invalidating Thm. 
B) restrict the bound symbols of well-formed expressions to variables in Curry’s sense. 





st = 





ANNALS OF MATHEMATICS
Vol. 36, No. 3, July, 1935 


LACUNARY RECURRENCE FORMULAS FOR THE NUMBERS OF 
BERNOULLI AND EULER 


By D. H. Lehmer
(Received February 24, 1934) 


Recurrence relations for the computation of the numbers of Bernoulli have 
been the subject of a great many papers. Nevertheless, only two extensive 
calculations have been carried out. Adams¹ has calculated the first 62 (non-zero)
Bernoulli numbers, while Cerébrenikoff² has given the first 92. Both these
intrepid calculators used recurrence formulas of the most primitive sort, in spite 
of the fact that several formulas had already been given, which would have 
saved them many hundreds of hours. 

It is customary to give recurrences whose coefficients are neatly expressed in 
terms of familiar functions, sometimes at the expense of considerable labor in 
calculating their actual value. In the recurrence relations in this paper the 
coefficients are designed for ease of calculation at the expense of compactness of 
expression. 

The reader may question the utility of tabulating more than 92 Bernoulli 
numbers and hence the need of giving formulas for extending their calculation. 
It is true that for ordinary purposes of analysis, for example in the asymptotic 
series of the Euler-Maclaurin summation formula, a dozen Bernoulli numbers are
sufficient. There are other problems, however, which depend upon more subtle 
properties of the Bernoulli numbers, such as their divisibility by a given prime. 
Examples of such problems are the second case of Fermat’s Last Theorem and 
the Riemann Zeta-function hypothesis. Our knowledge as to the divisibility 
properties of the Bernoulli numbers is still quite primitive and it would be highly 
desirable to add more to it, even if the knowledge thus gained be purely empirical. 

The method which we use applies not only to Bernoulli numbers B, but with 
equal ease to the numbers E, R, and G of Euler, Lucas and Genocchi.³ These
four sets of numbers may thus be considered together to make a symmetrical 
theory. Moreover R and G may be simply expressed in terms of B so that we 
thus obtain three sets of recurrences for Bernoulli numbers. The coefficients 
of the power series for trigonometric functions of higher order may also be dealt 
with by the same method as we indicate briefly in §4. 

In contrast to the recurrences usually given for Bernoulli numbers, in which 





1 Journal für Math. 85 (1878), 269-272. Collected Papers, v. 1.

2 Akademiia Nauk. Math. Phys. Kl.: Mémoires. Ser. 8. v. 16, no. 10 (1905), and v. 19, no. 4 (1906).

3 See Lucas: Théorie des Nombres, Paris, 1890, chapters 13 and especially 14. For elaborations of Lucas’ theory of Bernoulli and allied numbers see the following papers by E. T. Bell: Trans. Amer. Math. Soc. 24 (1922), 89-112; 28 (1926), 129-148; 31 (1929), 405-421.
































$B_n$ is made to depend upon all⁴ the preceding (non-zero) $B$'s, the recurrences
given in this paper have gaps, so that $B_n$ can be computed from those preceding
$B$'s whose subscripts are congruent to $n$ with respect to a modulus $m$, the length
of each gap. Recurrences with arbitrarily large gaps seem to have been considered
first by van den Berg.⁵ Twelve years later the same results (in less
explicit form, from the point of view of application) were obtained by R. Haussner,⁶
who expressed the coefficients of the recurrences with large gaps in terms of
the hypergeometric function. Both van den Berg and Haussner based their
method on the power series expansion of the product of $\sin \omega x$, where $\omega$ runs
over the $n$th roots of unity (a product suggested by Kronecker⁷). Both treatments,
especially that of van den Berg, are unnecessarily long and complicated.
A more straightforward discussion in finite terms has been given by Nielsen,⁸
who gives only the simpler results, however. Mention should be made of the
recurrences given by Ramanujan⁹ for small gaps. In the present paper we
obtain the recurrences of van den Berg and Haussner in a form for practical
application, as a part of a more general discussion by a natural finite method,
simpler than that of Nielsen.


1. The present development is based on a certain sum¹⁰ $\sigma_n(p, q, r, s, t)$, which
is defined for positive integral values of $p$, $q$, $r$, $s$, and for non-negative¹¹ values
of $n$ and $t$:

(1)    $\sigma_n = \sigma_n(p, q, r, s, t) = \sum (\eta_1 \eta_2 \cdots \eta_{s-1})^t \, (1 + \eta_1\varepsilon + \eta_2\varepsilon^2 + \cdots + \eta_{s-1}\varepsilon^{s-1})^n$,

where $\varepsilon = e^{2\pi i/pr}$ and each $\eta_\nu$ takes on the values $1, \eta, \eta^2, \cdots, \eta^{pq-1}$, where
$\eta = e^{2\pi i/pq}$, and the sum extends over all $(pq)^{s-1}$ possible combinations of these values.
For $s = r$ we have the following

THEOREM A. $\sigma_{kpq-rt}(p, q, r, r, t) = 0$ for every $k$ for which $kq$ is not a multiple
of $r$.

Proof. The $(pq)^{r-1}$ terms of $\sigma_n(p, q, r, r, t)$ may be grouped in two different
ways into $pq$ sets of $(pq)^{r-2}$ terms each, by assigning definite values to either
$\eta_1$ or $\eta_{r-1}$. We have in this way two partitions of $\sigma_n$ as follows:

(2)    $\sigma_n = \sum_{\nu=0}^{pq-1} S_\nu = \sum_{\nu=0}^{pq-1} S'_\nu$,





4 Recurrences for $B_n$ involving only $B_\nu$ for $n/2 < \nu < n$ have been given by Stern: Journal für Math. 84, 216; Radike, ibid. 89, 259; Saalschütz, Vorlesungen über die Bernoullischen Zahlen (Berlin 1893), p. 30 and Lucas loc. cit. p. 240.

5 “Over Periodieke Terugloopende Betrekkingen tusschen de Coefficienten in de Ontwikkeling van Functien,” Verslagen en Mededeelingen der Koninklijke Akademie van Wetenschappen. Amsterdam (1881), 16, 74-176.

6 Göttinger Nachrichten (1893), 777-809.

7 Journal de Math. (2), 1 (1856), 385.

8 Traité élémentaire des nombres de Bernoulli, Paris 1922, 195-225.

9 Journal of the Indian Math. Soc. 3, 1911, 219-234. Collected Papers, p. 1-14.

10 This sum is a natural generalization of the two sums of Kronecker, loc. cit.

11 For $n = 0$, $\sigma_n$ may involve $0^0$. This is taken as unity.







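where in $S_\nu$, $\eta_1$ is fixed at $\eta^\nu$, and in $S'_\nu$, $\eta_{r-1}$ is fixed at $\eta^\nu$. We now consider

$\varepsilon^n S'_\nu = \sum (\eta_1\eta_2\cdots\eta_{r-2}\,\eta^\nu)^t \, (\varepsilon + \eta_1\varepsilon^2 + \cdots + \eta_{r-2}\varepsilon^{r-1} + \eta^{\nu+q})^n$
$\qquad = \eta^{(\nu+q)n} \sum (\eta_1\eta_2\cdots\eta_{r-2}\,\eta^\nu)^t \, (1 + \eta^{-\nu-q}\varepsilon + \eta_1\eta^{-\nu-q}\varepsilon^2 + \cdots + \eta_{r-2}\eta^{-\nu-q}\varepsilon^{r-1})^n$,

since $\varepsilon^r = \eta^q$. If now we introduce into the first parenthesis the factor $\eta^{-\nu r - q(r-1)}$,
this parenthesis will contain the product of the $\eta$'s in the second parenthesis.
Carrying out the summation indicated, we obtain precisely $S_{-(\nu+q)}$, the subscript being taken modulo $pq$. Hence

(3)    $\varepsilon^n S'_\nu = \eta^{(\nu+q)(n+rt) - qt} \, S_{-(\nu+q)}$.

Setting $n = kpq - rt$ we observe that $n + rt$ is a multiple of $pq$, so that the power
of $\eta$ in (3) is not a function of $\nu$, and, in fact, is $\eta^{-qt} = \varepsilon^{-rt}$. Summing (3) for $\nu =
0, 1, 2, \cdots, pq - 1$ we thus obtain in view of (2), for $n = kpq - rt$,

$\varepsilon^{kpq-rt}\,\sigma_{kpq-rt} = \varepsilon^{-rt}\,\sigma_{kpq-rt}$.

That is

$((\varepsilon^p)^{kq} - 1)\,\sigma_{kpq-rt} = 0$.

Since $\varepsilon^p$ is a primitive $r$th root of unity, the theorem follows.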
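Theorem A can be probed numerically. The following is a minimal sketch, assuming the reading $\varepsilon = e^{2\pi i/pr}$, $\eta = e^{2\pi i/pq}$ of definition (1); it evaluates $\sigma_n$ by brute force in floating-point complex arithmetic. The names `sigma`, `vanishes` and `check_theorem_a` and the tolerance are illustrative choices, not part of the paper.

```python
import cmath
from itertools import product


def sigma(n, p, q, r, s, t):
    """Evaluate sigma_n(p, q, r, s, t) from definition (1) by brute force."""
    eps = cmath.exp(2j * cmath.pi / (p * r))   # epsilon = e^(2 pi i / pr)
    eta = cmath.exp(2j * cmath.pi / (p * q))   # eta = e^(2 pi i / pq)
    roots = [eta ** k for k in range(p * q)]   # admissible values of each eta_nu
    total = 0j
    for etas in product(roots, repeat=s - 1):
        weight = 1 + 0j                        # eta_1 * ... * eta_{s-1}
        inner = 1 + 0j                         # 1 + eta_1 eps + ... + eta_{s-1} eps^(s-1)
        for nu, e in enumerate(etas, start=1):
            weight *= e
            inner += e * eps ** nu
        total += weight ** t * inner ** n      # Python takes 0**0 as 1, as footnote 11 requires
    return total


def vanishes(n, p, q, r, t, tol=1e-10):
    """True when sigma_n(p, q, r, r, t) is zero up to rounding error."""
    scale = (p * q) ** (r - 1) * float(r) ** n  # crude bound on the size of the sum
    return abs(sigma(n, p, q, r, r, t)) <= tol * scale


def check_theorem_a(p, q, r, t, k_max=8):
    """Check sigma_{kpq-rt} = 0 for every k <= k_max with kq not a multiple of r."""
    for k in range(1, k_max + 1):
        n = k * p * q - r * t
        if n >= 0 and (k * q) % r != 0 and not vanishes(n, p, q, r, t):
            return False
    return True
```

For instance, `check_theorem_a(2, 1, 3, 0)` exercises the case used later for the gaps of 6, and `sigma(4, 2, 1, 2, 2, 0)` is one of the non-vanishing values, $(1+i)^4 + (1-i)^4 = -8$.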


2. The Numbers $B_n$, $G_n$, $R_n$, and $E_n$. These four sets of numbers may be
defined by

(4)
$B_0 = 1$, $B_1 = -1/2$, $(B + 1)^n - B^n = 0$ $(n > 1)$,
$G_0 = 0$, $G_1 = 1$, $(G + 1)^n + G^n = 0$ $(n > 1)$,
$R_0 = 1/2$, $R_1 = 0$, $(R + 1)^n - (R - 1)^n = 0$ $(n > 1)$,
$E_0 = 1$, $E_1 = 0$, $(E + 1)^n + (E - 1)^n = 0$ $(n > 1)$,

in which, after expansion, the exponents are degraded to subscripts. Equivalent
definitions in terms of generating functions are

(5)
$e^{Bz} = \dfrac{z}{e^z - 1}$ or $\cos 2Bz = z \cot z$,
$e^{Gz} = \dfrac{2z}{e^z + 1}$ or $\cos 2Gz = 2z \tan z$,
$e^{Rz} = \dfrac{z e^z}{e^{2z} - 1}$ or $\cos 2Rz = z \csc 2z$,
$e^{Ez} = \dfrac{2e^z}{e^{2z} + 1}$ or $\cos Ez = \sec z$.



























The numbers $B_n$, $R_n$, $G_n$, and $E_n$ vanish when $n$ is odd with the exceptions
$B_1 = -1/2$, $G_1 = 1$. We subjoin a small table of the values of these numbers
when $n$ is even.

  n      B_n       G_n       R_n          E_n
  2     +1/6       -1       -1/6           -1
  4     -1/30      +1       +7/30          +5
  6     +1/42      -3      -31/42         -61
  8     -1/30     +17     +127/30       +1385
 10     +5/66    -155    -2555/66      -50521
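The umbral definitions (4) translate directly into recurrences: expand each binomial, replace powers by subscripts, and solve for the highest-indexed number. The sketch below does this with exact rational arithmetic; the function name `umbral_numbers` is an illustrative choice, and the relations $G_n = 2(1 - 2^n)B_n$, $R_n = (1 - 2^{n-1})B_n$ used in the check are the ones quoted near the end of §3.

```python
from fractions import Fraction
from math import comb


def umbral_numbers(N):
    """Return lists B, G, R, E (indices 0..N) computed from definitions (4)."""
    B = [Fraction(1), Fraction(-1, 2)]
    G = [Fraction(0), Fraction(1)]
    R = [Fraction(1, 2), Fraction(0)]
    E = [Fraction(1), Fraction(0)]
    for n in range(2, N + 1):
        # (B+1)^(n+1) - B^(n+1) = 0: the B_{n+1} terms cancel, leaving B_n
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
        # (G+1)^n + G^n = 0: G_n occurs twice
        G.append(-sum(comb(n, k) * G[k] for k in range(n)) / 2)
        # (R+1)^(n+1) - (R-1)^(n+1) = 0: only terms with n+1-k odd survive
        R.append(-sum(comb(n + 1, k) * R[k]
                      for k in range(n) if (n + 1 - k) % 2 == 1) / (n + 1))
        # (E+1)^n + (E-1)^n = 0: only terms with n-k even survive
        E.append(-sum(comb(n, k) * E[k]
                      for k in range(n) if (n - k) % 2 == 0))
    return B, G, R, E
```

Calling `umbral_numbers(10)` reproduces the table above, including $G_6 = -3$.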


If we multiply each of the equations (4) by $a_m \binom{m}{n} x^{m-n}$ and sum over $n$ from 0
to $m$ (taking into special consideration the cases $n = 0, 1$), and then sum over
$m$, we get the fundamental relations

(6.1)    $f(B + x + 1) - f(B + x) = f'(x)$,
(6.2)    $f(G + x + 1) + f(G + x) = 2f'(x)$,
(6.3)    $f(R + x + 1) - f(R + x - 1) = f'(x)$,
(6.4)    $f(E + x + 1) + f(E + x - 1) = 2f(x)$,

where $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$.
If in (6.3) and (6.4) we set $f(x) = x^n$, while in (6.1) and (6.2) we set $f(x) =
(2x - 1)^n$, and replace $2x$ by $x$ in these latter results, we get

(7)
$(2B + (1 + x))^n - (2B - (1 - x))^n = 2n(x - 1)^{n-1}$,
$(2G + (1 + x))^n + (2G - (1 - x))^n = 4n(x - 1)^{n-1}$,
$(R + (1 + x))^n - (R - (1 - x))^n = n x^{n-1}$,
$(E + (1 + x))^n + (E - (1 - x))^n = 2x^n$.


Before expanding the binomials in (7) according to increasing powers of the
umbral letters $B$, $G$, $R$, and $E$, we recall that all odd powers vanish, except
$B^1$ and $G^1$. The terms arising from these symbols we transpose. After expanding
we may let $x$ range over values $x_j$ to be determined later, and then sum over
these values. We thus obtain

(8.1)    $\sum_{\nu=0}^{[n/2]} 2^{2\nu} B_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} - (-1)^n (1 - x_j)^{n-2\nu}\} = n \sum_j \{(1 + x_j)^{n-1} - (-1)^n (1 - x_j)^{n-1}\}$,

(8.2)    $\sum_{\nu=0}^{[n/2]} 2^{2\nu} G_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} + (-1)^n (1 - x_j)^{n-2\nu}\} = -2n \sum_j \{(1 + x_j)^{n-1} + (-1)^n (1 - x_j)^{n-1}\}$,

(8.3)    $\sum_{\nu=0}^{[n/2]} R_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} - (-1)^n (1 - x_j)^{n-2\nu}\} = n \sum_j x_j^{n-1}$,

(8.4)    $\sum_{\nu=0}^{[n/2]} E_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} + (-1)^n (1 - x_j)^{n-2\nu}\} = 2 \sum_j x_j^{n}$.


We now seek to determine the $x_j$ with a view to applying Theorem A. It is
simplest to consider first the equations (8.2) and (8.4). In these equations we
make $n$ even, say $n = 2m$. In the definition of $\sigma_n(p, q, r, r, t)$ set $pq = 2$ (that
is $\eta = -1$). We now observe that if the set $x_j$ coincides with the set

$\varepsilon + \eta_2\varepsilon^2 + \eta_3\varepsilon^3 + \cdots + \eta_{r-1}\varepsilon^{r-1}$,

then

(9)
$\sum_j \{(1 + x_j)^n + (1 - x_j)^n\} = \sigma_n(p, q, r, r, 0)$,
$\sum_j x_j^{2m} = e^{4\pi i m/pr}\,\sigma_{2m}(p, q, r, r - 1, 0)$,

where $p = 1$, $q = 2$ or $p = 2$, $q = 1$. Substituting (9) into (8.2) and (8.4) and
using Theorem A with $t = 0$, we have at once the general lacunary recurrences
for the numbers of Genocchi and Euler. First with $p = 2$, $q = 1$ we have

(10.2)    $\sum_{\lambda=0}^{[m/r]} 2^{2m-2\lambda r} G_{2m-2\lambda r} \binom{2m}{2\lambda r} \sigma_{2\lambda r}(2, 1, r, r, 0) = -4m\,\sigma_{2m-1}(2, 1, r, r, 0)$,

(10.4)    $\sum_{\lambda=0}^{[m/r]} E_{2m-2\lambda r} \binom{2m}{2\lambda r} \sigma_{2\lambda r}(2, 1, r, r, 0) = 2e^{2\pi i m/r}\,\sigma_{2m}(2, 1, r, r - 1, 0)$.

If $r$ is odd, $\sigma_n(2, 1, r, r, 0) = \sigma_n(1, 2, r, r, 0)$. Hence (10.2) and (10.4) remain
unaltered if $r$ is odd and 2, 1 is replaced by 1, 2. For $r = 2h$ however, different
formulas are obtained from Theorem A, when $p = 1$ and $q = 2$. These are

(11.2)    $\sum_{\lambda=0}^{[m/h]} 2^{2m-2\lambda h} G_{2m-2\lambda h} \binom{2m}{2\lambda h} \sigma_{2\lambda h}(1, 2, 2h, 2h, 0) = -4m\,\sigma_{2m-1}(1, 2, 2h, 2h, 0)$,

(11.4)    $\sum_{\lambda=0}^{[m/h]} E_{2m-2\lambda h} \binom{2m}{2\lambda h} \sigma_{2\lambda h}(1, 2, 2h, 2h, 0) = 2e^{2\pi i m/h}\,\sigma_{2m}(1, 2, 2h, 2h - 1, 0)$.



















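We now return to the numbers $B$ and $R$. If we rewrite (8.1) and (8.3), substituting
$y_j$ for $x_j$, and if we multiply the equations in $y_j$ by $(-1)^n$ and add them
to the equations in $x_j$, we obtain at once

(12.1)    $\sum_{\nu=0}^{[n/2]} 2^{2\nu} B_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} - (1 - y_j)^{n-2\nu} - (-1)^n (1 - x_j)^{n-2\nu} + (-1)^n (1 + y_j)^{n-2\nu}\}$
$\qquad = n \sum_j \{(1 + x_j)^{n-1} - (1 - y_j)^{n-1} - (-1)^n (1 - x_j)^{n-1} + (-1)^n (1 + y_j)^{n-1}\}$,

(12.3)    $\sum_{\nu=0}^{[n/2]} R_{2\nu} \binom{n}{2\nu} \sum_j \{(1 + x_j)^{n-2\nu} - (1 - y_j)^{n-2\nu} - (-1)^n (1 - x_j)^{n-2\nu} + (-1)^n (1 + y_j)^{n-2\nu}\}$
$\qquad = n \sum_j \{x_j^{n-1} + (-1)^n y_j^{n-1}\}$.

Again we set $pq = 2$ in (1), and choose $n$ in (12.1) and (12.3) of the same
parity as $r$ in (1). Finally we choose for $x_j$ the set of $2^{r-3}$ quantities

$x_j = \varepsilon + \eta_2\varepsilon^2 + \eta_3\varepsilon^3 + \cdots + \eta_{r-1}\varepsilon^{r-1}$, where $\eta_2\eta_3 \cdots \eta_{r-1} = +1$,

and for $y_j$ the $2^{r-3}$ quantities

$y_j = -\varepsilon + \eta_2\varepsilon^2 + \eta_3\varepsilon^3 + \cdots + \eta_{r-1}\varepsilon^{r-1}$, where $\eta_2\eta_3 \cdots \eta_{r-1} = (-1)^{r-1}$.

With these choices it is seen that

$\sum_j \{(1 + x_j)^k - (1 - y_j)^k - (-1)^n (1 - x_j)^k + (-1)^n (1 + y_j)^k\} = \sigma_k(p, q, r, r, 1)$

and

$\sum_j \{x_j^{n-1} + (-1)^n y_j^{n-1}\} = e^{2\pi i (n-1)/pr}\,\sigma_{n-1}(p, q, r, r - 1, 1)$.

Applying Theorem A and writing $n - r = 2m$, we have for $p = 2$, $q = 1$

(13.1)    $\sum_{\lambda=0}^{[m/r]} 2^{2m-2\lambda r} B_{2m-2\lambda r} \binom{2m+r}{2\lambda r+r} \sigma_{2\lambda r+r}(2, 1, r, r, 1) = (2m + r)\,\sigma_{2m+r-1}(2, 1, r, r, 1)$,

(13.3)    $\sum_{\lambda=0}^{[m/r]} R_{2m-2\lambda r} \binom{2m+r}{2\lambda r+r} \sigma_{2\lambda r+r}(2, 1, r, r, 1) = (2m + r)\,e^{\pi i (2m+r-1)/r}\,\sigma_{2m+r-1}(2, 1, r, r - 1, 1)$.

If $r$ is odd we may again interchange $p$ and $q$ without altering (13.1) or (13.3).
For $r = 2h$, however, Theorem A gives us

(14.1)    $\sum_{\lambda=0}^{[m/h]} 2^{2m-2\lambda h} B_{2m-2\lambda h} \binom{2m+2h}{2\lambda h+2h} \sigma_{2\lambda h+2h}(1, 2, 2h, 2h, 1) = 2(m + h)\,\sigma_{2m+2h-1}(1, 2, 2h, 2h, 1)$,

(14.3)    $\sum_{\lambda=0}^{[m/h]} R_{2m-2\lambda h} \binom{2m+2h}{2\lambda h+2h} \sigma_{2\lambda h+2h}(1, 2, 2h, 2h, 1) = e^{\pi i (2m-1)/h}\,2(m + h)\,\sigma_{2m+2h-1}(1, 2, 2h, 2h - 1, 1)$.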


3. Explicit recurrences. In order to obtain practical recurrences we have
only to evaluate or to give some effective method for calculating the various
$\sigma$'s that appear in equations (10) to (14). For small values of $r$ or $h$ this is simple
enough. But it is very desirable to have large gaps, which means large values
of $r$ and $h$. The labor of calculating the $\sigma$'s increases very rapidly, so that a
practical limit to the size of the gaps is soon reached. The most practical
recurrences have gaps of 12. We subjoin a few explicit recurrences which may
be derived from equations (10) to (14), by evaluating the $\sigma$'s and simplifying.

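From (10) and (13) we obtain for $r = 2$ the following recurrences with gaps
of 4:

(15)
$\sum_{\lambda=0}^{[m/2]} (-1)^\lambda\, 2^{m-2\lambda}\, B_{2m-4\lambda} \binom{2m+2}{4\lambda+2} = (-1)^{[m/2]}\,(m + 1)$,

$\sum_{\lambda=0}^{[m/2]} (-1)^\lambda\, 2^{m-2\lambda}\, G_{2m-4\lambda} \binom{2m}{4\lambda} = (-1)^{[m/2]+1}\, 2m$,

$\sum_{\lambda=0}^{[m/2]} (-1)^\lambda\, 2^{2\lambda}\, R_{2m-4\lambda} \binom{2m+2}{4\lambda+2} = (-1)^m\, \dfrac{m + 1}{2}$,

$\sum_{\lambda=0}^{[m/2]} (-1)^\lambda\, 2^{2\lambda}\, E_{2m-4\lambda} \binom{2m}{4\lambda} = (-1)^m$.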
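To see a lacunary recurrence at work, take the Euler case of (10.4) with $r = 2$, which reads $\sum_{\lambda} (-1)^\lambda 2^{2\lambda} E_{2m-4\lambda} \binom{2m}{4\lambda} = (-1)^m$. Solving for the $\lambda = 0$ term gives $E_{2m}$ in terms of $E_{2m-4}, E_{2m-8}, \cdots$ only. A minimal sketch (the function name `euler_gap4` is an illustrative choice):

```python
from math import comb


def euler_gap4(M):
    """Return {2m: E_2m} for m = 0..M using only every second earlier E."""
    E = {}
    for m in range(M + 1):
        # terms with lambda >= 1 involve E_{2m-4}, E_{2m-8}, ... which are already known
        tail = sum((-1) ** lam * 4 ** lam * E[2 * m - 4 * lam] * comb(2 * m, 4 * lam)
                   for lam in range(1, m // 2 + 1))
        E[2 * m] = (-1) ** m - tail
    return E
```

Here $E_{10}$ is obtained from $E_6$ and $E_2$ alone; the chain for odd $m$ never touches $E_0$, $E_4$, $E_8$.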
In order to obtain recurrences with gaps of 4 from equations (11) and (14) 
we set h = 2, and get 
[m/2] 


2 Bom—4 ets) ((—1)* 2414 1) = at. (nl Qt) 4 ) 


[m/2] m 
2Gom + Zz Gom—4r (=) ((—1) Q2r—-1 + 1) ae -m((9 an re ) 








































[m/2] 
Dy Ramey 7 4) BCP FD = Om + Mw $Y 


(m/2] ’ 
2Em + >) Eom ey 20((—1) 21 41) =o +1, 
A=1 
where cy = 1l,c, = —41, andc, = —6c,_, —25c,_, 
co =1,¢, = —3 , and c. = —6c,_, —25c,_¢. 


It is evident that these last recurrences are more complicated than (15).
The same is true and to a greater extent for larger gaps. We shall not give
space to further examples of equations (11) and (14).

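Recurrences with gaps of 6 are obtained from (10) and (13) putting $r = 3$,
and are as follows:

$\sum_{\lambda=0}^{[m/3]} B_{2m-6\lambda} \binom{2m+3}{6\lambda+3} = -\dfrac{2m+3}{6}$ if $m = 3k - 1$, and $= \dfrac{2m+3}{3}$ otherwise;

$4G_{2m} + 3\sum_{\lambda=1}^{[m/3]} G_{2m-6\lambda} \binom{2m}{6\lambda} = 2m$ if $m = 3k - 1$, and $= -4m$ otherwise;

$\sum_{\lambda=0}^{[m/3]} 2^{6\lambda+3}\, R_{2m-6\lambda} \binom{2m+3}{6\lambda+3} = \dfrac{2m+3}{3}\,((-1)^m 3^{m+1} + 1)$;

$E_{2m} + 3\sum_{\lambda=1}^{[m/3]} 2^{6\lambda-2}\, E_{2m-6\lambda} \binom{2m}{6\lambda} = \dfrac{1}{2}\,((-1)^m 3^m + 1)$.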
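The Bernoulli recurrence with gaps of 6 can be run as it stands. The sketch below assumes it in the form that (13.1) yields for $r = 3$, namely $\sum_{\lambda} B_{2m-6\lambda}\binom{2m+3}{6\lambda+3} = -(2m+3)/6$ when $m \equiv 2 \pmod 3$ and $(2m+3)/3$ otherwise, and solves for the $\lambda = 0$ term. The function name `bernoulli_gap6` is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def bernoulli_gap6(M):
    """Return {2m: B_2m} for m = 0..M, each from B_{2m-6}, B_{2m-12}, ... only."""
    B = {}
    for m in range(M + 1):
        if m % 3 == 2:                       # m = 3k - 1
            rhs = Fraction(-(2 * m + 3), 6)
        else:
            rhs = Fraction(2 * m + 3, 3)
        tail = sum(B[2 * m - 6 * lam] * comb(2 * m + 3, 6 * lam + 3)
                   for lam in range(1, m // 3 + 1))
        B[2 * m] = (rhs - tail) / comb(2 * m + 3, 3)
    return B
```

No starting values are needed: $m = 0$ already gives $B_0 = 1$, and $B_{12} = -691/2730$ comes out of $B_6$ and $B_0$.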


For $r > 3$, $\sigma_n(2, 1, r, s, t)$ can no longer be expressed explicitly without introducing
powers of irrationalities.¹² The practical method of evaluating these
expressions is to resort to linear recurring series. Hence in the recurrences with
larger gaps, the $\sigma$'s are given as recurring series, which, after all, are nearly as
easy to work with as powers of an integer. Thus for $r = 4$, the following
recurring series are used.

  n     α_n     α'_n        f_n       f'_n      g_n     g'_n
  0    [...]    [...]      [...]      [...]      -1     [...]
  1    [...]    [...]      [...]      [...]      -5     [...]
  2    [...]    [...]      [...]      [...]      -1     [...]
  3      -4       10       -239        -99       43      -23
  4     -34       14       1393        577       95       17
  5     -29      -12      -8119      -3363     -197      241
  6     -41      -99      47321      19601    -1249      329
  7     140   (?)338    -275807    -114243     -725    -1511

$\alpha$ and $\alpha'$ satisfy the recurrence $U_n = -34U_{n-4} - U_{n-8}$,
$f$ and $f'$ satisfy the recurrence $U_n = -6U_{n-1} - U_{n-2}$,
$g$ and $g'$ satisfy the recurrence $U_n = 2U_{n-1} - 9U_{n-2}$.

12 Thus for $g_n$ we have $-2g_n = (5 + \sqrt{-2})(1 + \sqrt{-8})^{n-1} + (5 - \sqrt{-2})(1 - \sqrt{-8})^{n-1}$.

With these definitions we have the following recurrences with gaps of 8:
m , m+1 m 
a abet gel, = ln 42) ates 
X=0 
ee 
X=0 
SR am + 4) cote! as = — (m 42 
Ls 2m—8) & 4. 4 A442 = m + 2)(fmar + Gms) 
> Eom—s (= 2 = Sn + Gm 
A=0 
The recurrences with gaps of 12 are given explicitly in terms of the following 8
recurring series:

  n        β_n          β'_n
  0          1     [illegible]
  1          5            5
  2         26     [illegible]
  3         97          -26
  4        265         -265
  5        362        -1351
  6      -1351        -5042
  7     -13775       -13775
  8     -70226       -18817
  9    -262087        70226
 10    -716035       716035
 11    -978122      3650401

  n      u_n      v_n      w_n     u'_n     v'_n     w'_n
  0        2        2        2       -1        1        0
  1       -6        2       10       11        1       12
  2       10      -70       90     -101     -179      200
  3      306    -1078     1234      559    -1259     2660
  4    -4846    -2446    18290      119     7129    35472

$\beta$ and $\beta'$ satisfy $U_n = -2702U_{n-6} - U_{n-12}$,
$u$ and $u'$ satisfy $U_n = -12U_{n-1} - 62U_{n-2} + 36U_{n-3} - 169U_{n-4}$,
$v$ and $v'$ satisfy $U_n = 4U_{n-1} - 78U_{n-2} - 428U_{n-3} - 1369U_{n-4}$,
$w$ and $w'$ satisfy $U_n = 20U_{n-1} - 110U_{n-2} + 356U_{n-3} - 25U_{n-4}$.
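The fourth-order scales of relation can be checked against the tabulated values, and used to continue the series, with a few lines of code. This is a sketch only; `extend` is a hypothetical helper, and the coefficient lists are the scales read as $U_n = c_1U_{n-1} + c_2U_{n-2} + c_3U_{n-3} + c_4U_{n-4}$.

```python
def extend(seed, coeffs, length):
    """Continue a linear recurring series U_n = c_1 U_{n-1} + ... + c_k U_{n-k}."""
    series = list(seed)
    while len(series) < length:
        # coeffs[0] multiplies the most recent term, coeffs[1] the one before it, ...
        series.append(sum(c * series[-i] for i, c in enumerate(coeffs, start=1)))
    return series


SCALE_U = [-12, -62, 36, -169]
SCALE_V = [4, -78, -428, -1369]
SCALE_W = [20, -110, 356, -25]
```

Starting from the first four entries of each column, `extend(..., 5)` returns the fifth tabulated entry, for example `extend([2, -6, 10, 306], SCALE_U, 5)[-1]` gives $u_4$.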


We have then the following recurrences 


{m/6] 
2 6 
2s Bom—12r poled ams + (— 1)'2+*) 


, 


a... a : (Guns + (— LElgmss ~ (= 1) *3) if m = 2(mod 3) 
(15a) 





— 


[m/6] 


8 Gam + 3 be Gm—120 (my. + Ber + (—1)28-1) 


h=1 
= 6m (1 + Bus + (LF), 


where @ is equal to 1 or —2 according as m = 2 (mod 3) or not. 


[m/6] 

2 6 . 

3 > Rom—12r ee :) ZIA+5 (Bare + (— 1)26+2) 
h=0 


= (—1)"""(m + 3)(Uns2 + Unt2 — Wmt2) « 
[m/6] 


SEom + 3 be Eom—12 feng 2-11 + Be + (— 1)428-) 
h=1 


= (— 1)"(8" + (— 1)" + Un + Um + Un) - 


Recurrences with gaps of 14 or more are not so practical as those given above. 
We have worked out those with gaps of 16 (the next simplest case after 12), 
but these involve 8 order recurring series whose scales of relation have very 
large coefficients exceeding 10" in most cases. 

Replacing $G_n$ and $R_n$ by

$G_n = 2(1 - 2^n)B_n$,
$R_n = (1 - 2^{n-1})B_n$

in the above recurrences, we obtain further formulas for $B_n$.










In connection with the calculation of $B_n$, it is well to point out that Adams’
method of using the von Staudt-Clausen theorem to eliminate fractions can be
adapted to any recurrence for $B_n$ whatever, having integer coefficients. In
fact, we make the substitution¹³

(16)    $B_{2n} = A_n - \sum 1/p$,

where the sum extends to all primes $p$ for which $p - 1$ divides $2n$. Then $A_n$ is
an integer and may be considered as the unknown part of $B_{2n}$. Now if we substitute
(16) in any recurrence, we may combine and transpose to the right all
terms arising from $1/p$ for each $p$ involved. In this way it is clear that we obtain
a series of fractions whose denominators are distinct primes and whose sum is an
integer. Hence these fractions are actually integers¹⁴ and the recurrences
reduce to operations with whole numbers.
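The substitution (16) is easy to illustrate for small indices. The sketch below computes Bernoulli numbers from the ordinary (non-lacunary) recurrence, adds back the von Staudt-Clausen fractions, and returns the integer $A_n$; all function names are illustrative choices.

```python
from fractions import Fraction
from math import comb


def bernoulli(N):
    """B_0..B_N (with B_1 = -1/2) from the full recurrence (B+1)^(n+1) - B^(n+1) = 0."""
    B = [Fraction(1)]
    for n in range(1, N + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def clausen_primes(two_n):
    """Primes p for which p - 1 divides 2n."""
    return [p for p in range(2, two_n + 2) if is_prime(p) and two_n % (p - 1) == 0]


def integer_part(n, B):
    """A_n of (16): B_2n plus the sum of 1/p over the Clausen primes of 2n."""
    return B[2 * n] + sum(Fraction(1, p) for p in clausen_primes(2 * n))
```

With `B = bernoulli(20)`, the values `integer_part(n, B)` for $n = 1, \cdots, 6$ are all equal to 1, and $A_7 = 2$.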

To illustrate this process let us consider the computation of $B_{196}$ using (15a).
For brevity we write

$(-1)^\lambda \beta_{6\lambda+2} + 2^{6\lambda+2} = c_\lambda$,

$c_0 = 30$, $c_1 = 70482$, $c_2 = 189767010$, $\cdots$

Setting $m = 98$ in (15a) we obtain at once

$\sum_{\lambda=0}^{16} (-1)^\lambda B_{196-12\lambda} \binom{202}{12\lambda+6} c_\lambda = -\dfrac{101}{6}\,(\beta_{100} - 2^{100} + 3)$.

Writing

$\binom{202}{12\lambda+6} c_\lambda = T_\lambda$

and using (16) we have

(17)    $\sum_{\lambda=0}^{16} (-1)^\lambda A_{98-6\lambda}\, T_\lambda = -\dfrac{101}{6}\,(\beta_{100} - 2^{100} + 3) + \sum_p \dfrac{1}{p} \sum_\nu T_\nu (-1)^\nu$,

where $p = 2, 3, 5, \cdots$ are primes and $\nu$ are those solutions of the congruence

(18)    $196 - 12\nu \equiv 0 \pmod{p - 1}$

for which $0 \le \nu \le 16$. From (18) we see that $p \le 197$ and that if $p > 3$, $p$ is
of the form $6x + 5$. It is easy to verify that $\beta_{100} - 2^{100} + 3$ is a multiple of 6.
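The primes admitted by the congruence (18), and the values of $\nu$ attached to each, can be enumerated directly. A minimal sketch (the function names and default arguments are illustrative choices):

```python
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def congruence_solutions(total=196, step=12, nu_max=16):
    """Map each prime p to the nu in 0..nu_max with total - step*nu divisible by p - 1."""
    table = {}
    for p in range(2, total + 2):
        if not is_prime(p):
            continue
        nus = [nu for nu in range(nu_max + 1) if (total - step * nu) % (p - 1) == 0]
        if nus:
            table[p] = nus
    return table
```

The keys of `congruence_solutions()` are 2, 3, 5 and fourteen further primes ending with 197, every one of them of the form $6x + 5$; for example $p = 11$ carries $\nu = 3, 8, 13$ and $p = 29$ carries $\nu = 0, 7, 14$.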





13 Adams (loc. cit.) wrote the equivalent of $J_\nu = (-1)^{\nu-1}(A_\nu - 1)$ so that $J_\nu = 0$ for $\nu \le 6$.
For lacunary recurrences this artifice is of little use.

14 This fact leads at once to interesting congruences modulo p involving the coefficients 
of the recurrence. 





























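Hence each of the terms $\dfrac{1}{p} \sum_\nu T_\nu (-1)^\nu$ is an integer. In fact, the actual terms
are as follows:

$\sum_p \dfrac{1}{p} \sum_\nu T_\nu (-1)^\nu = \dfrac{31}{30} \sum_{\nu=0}^{16} T_\nu (-1)^\nu + (-T_{13} + T_8 - T_3)/11$
$\qquad - (T_{15} + T_{11} + T_7 + T_3)/17 - T_9/23 + (T_{14} - T_7 + T_0)/29$
$\qquad - (T_{13} + T_3)/41 - T_1/47 + T_{12}/53 - T_9/89 + T_8/101 - T_7/113$
$\qquad - T_5/137 + T_4/149 + T_2/173 + T_0/197$.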

The fractional part of $B_{196}$ by (16) is 183883/171390. Calculating $A_{98}$ from
(17) with the aid of the tables of Adams and Cerébrenikoff, and subtracting the
above fractional part, we obtain the following value for $B_{196}$:

Numerator:

−62753 13511 04611 93672 55310 66998 93713 60315
30541 53311 89530 55906 39107 01782 46402 41378
48048 46255 54578 57614 21158 35788 96086 55345
32214 56098 29255 49798 68376 27052 31316 61171
66687 49347 22145 80056 71217 06735 79434 16524
98443 87718 31113

Denominator:

171390.







This is the first instance of an isolated entry in the table of Bernoulli numbers. 
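The denominator and the fractional part quoted for $B_{196}$ follow from the von Staudt-Clausen theorem alone and can be checked without computing the number itself. A minimal sketch (function names are illustrative choices):

```python
from fractions import Fraction
from math import prod


def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def clausen_primes(two_n):
    """Primes p for which p - 1 divides 2n."""
    return [p for p in range(2, two_n + 2) if is_prime(p) and two_n % (p - 1) == 0]


def clausen_data(two_n):
    """Return (denominator of B_2n, sum of 1/p over the Clausen primes)."""
    primes = clausen_primes(two_n)
    return prod(primes), sum(Fraction(1, p) for p in primes)
```

For `clausen_data(196)` the primes involved are 2, 3, 5, 29 and 197, whose product is 171390.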





4. Application of Theorem A to other sets of numbers. The four sequences
of numbers $B$, $G$, $R$, $E$ are special cases, for $h = 2$, of a set of $h^2$ sequences of
rational numbers which are the coefficients of the power series developments
of the $h$ reciprocals and the $h(h - 1)$ ratios of the $h$ functions of Olivier¹⁵

$\varphi_\nu(z) = \dfrac{z^\nu}{\nu!} + \dfrac{z^{\nu+h}}{(\nu+h)!} + \dfrac{z^{\nu+2h}}{(\nu+2h)!} + \cdots$    $(\nu = 0, 1, \cdots, h - 1)$.

The detailed account of these $h^2$ sequences will appear elsewhere. We illustrate
here merely the application of Theorem A to one of the 9 sequences associated
with $h = 3$. This sequence of integers (the counterpart of $E$) we designate by
$W$ and define as follows:

$W_0 = 1$, $(W + 1)^n + (W + \omega)^n + (W + \omega^2)^n = 0$ $(n > 0)$,

where $\omega = e^{2\pi i/3}$. It follows as in (6.4) that

(19)    $f(W + x + 1) + f(W + x + \omega) + f(W + x + \omega^2) = 3f(x)$.





15 Journ. für Math. 2 (1827), 243-251.







Setting $x = 0$ and $f(t) = e^{tz}$ we have as the counterpart of (5.4),

$e^{Wz} = \dfrac{3}{e^z + e^{\omega z} + e^{\omega^2 z}} = \dfrac{1}{\varphi_0(z)}$,

from which we see that $W_n$ is zero if $n$ is not a multiple of 3. The first few non-zero
values of $W$ are

$W_0 = 1$, $W_3 = -1$, $W_6 = 19$, $W_9 = -1513$, $W_{12} = 315523$.
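The values of $W$ follow from the defining relation once it is noted that $1 + \omega^{j} + \omega^{2j}$ equals 3 when $j$ is a multiple of 3 and 0 otherwise, so that for $n = 3k > 0$ the relation reduces to $\sum_{j=0}^{k} \binom{3k}{3j} W_{3j} = 0$. A minimal sketch (the function name `olivier_w` is an illustrative choice):

```python
from math import comb


def olivier_w(K):
    """Return {3k: W_3k} for k = 0..K from the full (non-lacunary) recurrence."""
    W = {0: 1}
    for k in range(1, K + 1):
        # sum_{j=0}^{k} C(3k, 3j) W_{3j} = 0, solved for the j = k term
        W[3 * k] = -sum(comb(3 * k, 3 * j) * W[3 * j] for j in range(k))
    return W
```

Calling `olivier_w(4)` returns the five values listed above.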





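Next set $f(t) = t^{3k}$ in (19) and let $x$ range over a set $x_j$. Expanding and summing
we obtain as in (8.4)

$\sum_{\lambda=0}^{k} W_{3\lambda} \binom{3k}{3\lambda} \sum_j \{(1 + x_j)^{3k-3\lambda} + (\omega + x_j)^{3k-3\lambda} + (\omega^2 + x_j)^{3k-3\lambda}\} = 3 \sum_j x_j^{3k}$.

In order to apply Theorem A we let $x_j$ run through the values

$x_j = \varepsilon + \eta_2\varepsilon^2 + \eta_3\varepsilon^3 + \cdots + \eta_{r-1}\varepsilon^{r-1}$,

where each $\eta_\nu = 1, \omega, \omega^2$. We have then the general lacunary recurrence

$\sum_{\lambda=0}^{[k/r]} W_{3k-3\lambda r} \binom{3k}{3\lambda r} \sigma_{3\lambda r}(p, q, r, r, 0) = 3e^{6\pi i k/pr}\,\sigma_{3k}(p, q, r, r - 1, 0)$,

where $pq = 3$. Setting $r = 2$ we have the following example with $p = 3$, $q = 1$:

$W_{3k} + 2\sum_{\lambda=1}^{[k/2]} W_{3k-6\lambda} \binom{3k}{6\lambda} (-1)^\lambda\, 3^{3\lambda-1} = (-1)^k$.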
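The closing example can be run in the same way as the Bernoulli and Euler recurrences. The sketch below assumes it in the form $W_{3k} + 2\sum_{\lambda \ge 1} (-1)^\lambda 3^{3\lambda-1} W_{3k-6\lambda}\binom{3k}{6\lambda} = (-1)^k$, which is what Theorem A gives for $p = 3$, $q = 1$, $r = 2$; the function name `w_gap6` is an illustrative choice.

```python
from math import comb


def w_gap6(K):
    """Return {3k: W_3k} for k = 0..K, each from W_{3k-6}, W_{3k-12}, ... only."""
    W = {}
    for k in range(K + 1):
        tail = sum((-1) ** lam * 3 ** (3 * lam - 1) * W[3 * k - 6 * lam] * comb(3 * k, 6 * lam)
                   for lam in range(1, k // 2 + 1))
        W[3 * k] = (-1) ** k - 2 * tail
    return W
```

Here $W_{12}$ is obtained from $W_6$ and $W_0$ without any reference to $W_3$ or $W_9$.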


PRINCETON, New JERSEY. 











ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 





MECHANICAL MODELS OF SPACES WITH POSITIVE-DEFINITE 
LINE-ELEMENT 


By J. L. Synge
(Received April 23, 1934) 














1. Introduction. Consider a dynamical system with $N$ degrees of freedom,
holonomic and without moving constraints. The totality of configurations
constitute a space $V_N$, the manifold of configurations or configuration-space, whose
points are the configurations of the dynamical system, and whose topology is
therefore determined by the dynamical system. If $x^i$ are generalised coördinates,
the kinetic energy is of the form

(1.1)    $T = \tfrac{1}{2}\, a_{ij}\, \dot{x}^i \dot{x}^j$,    $(a_{ij} = a_{ji})$,

where the dot indicates differentiation with respect to the time, and repeated
suffixes are to be summed over the range 1 to $N$. We may assign to $V_N$ the
kinematical line-element¹

(1.2)    $ds^2 = 2T\,dt^2 = a_{ij}\, dx^i\, dx^j$.







This quadratic form is positive-definite. Thus a dynamical system provides us 
with a Riemannian space, defined topologically and metrically. The interest of 
this representation is two-fold. In the first place, it gives a geometrical inter- 
pretation to the dynamical problem; secondly, it gives a dynamical interpreta- 
tion to multidimensional spaces or to spaces with a topology and metric which 
cannot be realised in a Euclidean space. It is not, of course, implied that it is 
always possible to find a dynamical system corresponding to an arbitrarily 
assigned space. 

















2. Mechanical interpretation of geometrical orthogonality. The covariant
components of generalised force $X_i$ acting on the system are defined by the
equation

(2.1)    $X_i\, dx^i = dW$,

where $dW$ is the work done in the infinitesimal displacement $dx^i$. Raising the
subscript in the usual manner, the contravariant components are

(2.2)    $X^i = a^{ij} X_j$.





1 For a systematic development of this geometrical aspect of dynamics, see Phil. Trans. 
Roy. Soc. (A) 226 (1926), 31-106. 




Now any contravariant vector defines a direction in $V_N$: in particular $X^i$ defines a
direction $dx^1 : dx^2 : \cdots : dx^N$ by means of the equations

(2.3)    $dx^i = \theta X^i$,

where $\theta$ is an undetermined infinitesimal. The lines of the congruence in $V_N$
defined by these equations may be called the lines of force. If the system is
under the influence of a conservative force system, so that

(2.4)    $X_i = -\partial V/\partial x^i$,

the lines of force are orthogonal trajectories to the $(N - 1)$-spaces $V$ = const.,
the equipotential surfaces.

It should be noticed that the geometrical representation of a dynamical
system does not depend on the forces acting on it. The forces are only introduced
in a supplementary way.

Let us now seek a mechanical interpretation of the lines of force. The equations
of motion of the system may be written

(2.5)    $\dfrac{d^2x^i}{dt^2} + \left\{ {i \atop j\,k} \right\} \dfrac{dx^j}{dt} \dfrac{dx^k}{dt} = X^i$.

If the system starts from rest at a certain configuration $P$ at $t = 0$, the principal
parts of the increments in the coördinates, when expanded in power series in $t$,
are proportional to the initial values of $d^2x^i/dt^2$, and therefore proportional to the
initial values of $X^i$. Thus when a system starts from rest under the influence of a
force-system, the initial direction of motion (or the initial displacement) in $V_N$
coincides with the direction of the line of force.

The condition for the orthogonality of two directions in $V_N$, defined by $X^i$, $Y^i$, is

(2.6)    $a_{ij} X^i Y^j = 0$.

In order to avoid confusion between orthogonality in the configuration-space $V_N$,
defined by (2.6), and orthogonality in the ordinary or physical space in which
the dynamical system moves, we shall refer to orthogonality in $V_N$ as geometrical
orthogonality and to orthogonality in the physical space as physical orthogonality.

We seek a mechanical test for the geometrical orthogonality of two infinitesimal
displacements $D_1$, $D_2$ of a dynamical system. Now, by (2.1), we see
that the work done by the force $X^i$ in a displacement $dx^i$ is zero if, and only if,
the displacement is orthogonal to the line of force. Therefore an infinitesimal
displacement $D_1$ is geometrically orthogonal to an infinitesimal displacement $D_2$
if, and only if, no work is done in the displacement $D_1$ by a force-system which
produces an initial motion in the direction of $D_2$.
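The work test can be spelled out in a few lines. The sketch below takes a particle of mass $m$ in plane polar coördinates $(r, \theta)$ as an assumed example, for which $a_{ij}$ is the diagonal matrix with entries $m$, $m r^2$; a force producing initial motion along $D_2$ has contravariant components proportional to $D_2$, hence covariant components $a_{ij} D_2^j$, and the work it does in $D_1$ is $a_{ij} D_1^i D_2^j$. The function names are illustrative choices.

```python
def lower_index(a, X):
    """Covariant components X_i = a_ij X^j."""
    return [sum(a[i][j] * X[j] for j in range(len(X))) for i in range(len(X))]


def work(a, d1, d2):
    """Work done in displacement d1 by a force whose line of force has direction d2."""
    force = lower_index(a, d2)              # covariant force, up to a positive factor
    return sum(f * dx for f, dx in zip(force, d1))


def geometrically_orthogonal(a, d1, d2, tol=1e-12):
    """Test (2.6): the two displacements are orthogonal when the work vanishes."""
    return abs(work(a, d1, d2)) <= tol


def polar_metric(m, r):
    """a_ij for T = (m/2)(r'^2 + r^2 theta'^2)."""
    return [[m, 0.0], [0.0, m * r * r]]
```

With `a = polar_metric(1.0, 2.0)`, a purely radial and a purely angular displacement are geometrically orthogonal, while `[1, 1]` and `[1, -1]` are not, since $1 - 4 \neq 0$.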

The test for orthogonality may also be stated in terms of impulses. If the
components $X_i$ of force tend to infinity and their period of operation tends to
zero, we obtain the covariant components of the generalized impulse in the form

(2.7)    $Z_i = \lim_{t_2 \to t_1} \int_{t_1}^{t_2} X_i\, dt$.






The abrupt change in velocity caused by the application of this impulse is given
by

(2.8)    $\Delta(\partial T/\partial \dot{x}^i) = Z_i$,

or

(2.9)    $\Delta v^i = a^{ij} Z_j = Z^i$,

where $v^i = \dot{x}^i$, the generalized velocity of the system. If, then, the system is
initially at rest, the velocity generated by the impulse is

(2.10)    $v^i = Z^i$,

and hence the direction of initial motion is that of the impulse. Hence an
infinitesimal displacement $D_1$ is geometrically orthogonal to an infinitesimal
displacement $D_2$ if, and only if, no impulsive work is done in the displacement $D_1$
by an impulse which produces an initial motion in the direction of $D_2$. The
impulsive work here mentioned is given by $Z_i\, dx^i$, and is to be computed in the
ordinary manner, finite forces being replaced by impulsive forces.

An advantage of the preceding tests for geometrical orthogonality is that they
are framed in a manner independent of the coördinate system. A coördinate
system may introduce an apparent singularity or indeterminacy which is not
intrinsic to the system, as, for example, in the case of spherical polar coördinates
on a sphere or the Eulerian angles in the case of a rigid body turning
about a fixed point. Our test tells us at once, in the latter case, that infinitesimal
rotations about the principal axes of inertia are geometrically orthogonal
displacements.

Let us now consider more generally the question of the geometrical orthogonality
of two infinitesimal displacements of a rigid body with a fixed point. Let
$A$, $B$, $C$ be the principal moments of inertia. Any infinitesimal displacement is
an infinitesimal rotation about some line through the fixed point, and this rotation
is specified by its three components $\theta_1$, $\theta_2$, $\theta_3$ along the principal axes, the
direction-cosines of the axis of the rotation being of course proportional to these
quantities. Now if the body, at rest, receives an impulsive couple with components
$G_1$, $G_2$, $G_3$ along the principal axes, it begins to move with an angular
velocity having components $\omega_1$, $\omega_2$, $\omega_3$, where


(2.11)    $$A\omega_1 = G_1, \quad B\omega_2 = G_2, \quad C\omega_3 = G_3.$$


The condition that zero impulsive work be done in an infinitesimal rotation
$\theta_1$, $\theta_2$, $\theta_3$ is that the axis of this rotation should be physically orthogonal to the
vector representing the impulsive couple: the condition for this is


(2.12)    $$G_1\theta_1 + G_2\theta_2 + G_3\theta_3 = 0.$$


Substituting for the $G$'s from (2.11), and substituting $\theta_1'$, $\theta_2'$, $\theta_3'$ for $\omega_1$, $\omega_2$, $\omega_3$,
the former set representing the components of an infinitesimal rotation having
the same axis of rotation as the latter, we see that the condition that infinitesimal
rotations $(\theta_1, \theta_2, \theta_3)$ and $(\theta_1', \theta_2', \theta_3')$ be geometrically orthogonal is


(2.13)    $$A\theta_1\theta_1' + B\theta_2\theta_2' + C\theta_3\theta_3' = 0,$$


or, in words, the condition for the geometrical orthogonality of infinitesimal rotations 
is that their axes should be conjugate with respect to the momental ellipsoid. 
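
As a small numerical illustration of (2.13), the following sketch evaluates the bilinear form for two pairs of infinitesimal rotations. The function names and the sample moments of inertia are assumptions made for the example and do not come from the text.

```python
def momental_form(theta, theta_prime, moments):
    """Left-hand side of (2.13): A*t1*t1' + B*t2*t2' + C*t3*t3'."""
    return sum(m * t * tp for m, t, tp in zip(moments, theta, theta_prime))


def physical_dot(theta, theta_prime):
    """Ordinary (physical) scalar product of the two rotation vectors."""
    return sum(t * tp for t, tp in zip(theta, theta_prime))


# Hypothetical principal moments A, B, C (any positive values would do).
MOMENTS = (2.0, 3.0, 5.0)

# Rotations about two different principal axes.
axis_1 = (1.0, 0.0, 0.0)
axis_2 = (0.0, 1.0, 0.0)

# Two rotations chosen to be conjugate with respect to the momental
# ellipsoid although their axes are not physically orthogonal.
r1 = (1.0, 1.0, 0.0)
r2 = (3.0, -2.0, 0.0)

if __name__ == "__main__":
    print("principal axes:", momental_form(axis_1, axis_2, MOMENTS))
    print("conjugate pair:", momental_form(r1, r2, MOMENTS), physical_dot(r1, r2))
```

The second pair shows that geometrical and physical orthogonality are distinct notions when the moments of inertia are unequal.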


3. Some simple systems. Consider a system consisting of a disc which
can turn about an axis perpendicular to its plane, the centre being at the same
time free to move along a straight line in the plane. If $\theta$ denotes the angle of
rotation, and $z$ the displacement of the centre, we have


(3.1)    $$T = \tfrac{1}{2} m \dot{z}^2 + \tfrac{1}{2} I \dot{\theta}^2,$$


where $m$ is the mass and $I$ the moment of inertia. Hence, putting $x = m^{1/2} z$,
$y = I^{1/2} \theta$, we have


(3.2)    $$ds^2 = dx^2 + dy^2.$$


The configuration-space is therefore a flat space of two dimensions. It has the
connectivity of a cylinder, since an increase in $y$ of amount $2\pi I^{1/2}$ restores the
configuration.

We may represent the configuration-space by a cylinder embedded in Euclid- 
ean space of three dimensions. The trajectories under no forces are always the 
geodesics of the configuration-space. They are therefore represented by helices 
on the cylinder. 

Consider a system consisting of two discs, rotating independently. We have


(3.3)    $$T = \tfrac{1}{2} I_1 \dot{\theta}_1^2 + \tfrac{1}{2} I_2 \dot{\theta}_2^2,$$

and if we put $x = I_1^{1/2} \theta_1$, $y = I_2^{1/2} \theta_2$, we have

(3.4)    $$ds^2 = dx^2 + dy^2.$$


Thus we have here a flat space of two dimensions, which is easily seen to have the
connectivity of an anchor ring, formed by joining the opposite edges of a rectangle.
This dynamical system gives us a concrete realisation of such a manifold,
simpler perhaps than that proposed by Killing.²

Consider a system consisting of a bead carried on a straight wire which turns
round an axis which intersects it perpendicularly. If $m$ is the mass of the bead
and $I$ the moment of inertia of the wire, and if $\theta$ is the angle of rotation of the
wire and $r$ the distance of the bead from the axis, we have


(3.5)    $$T = \tfrac{1}{2} I \dot{\theta}^2 + \tfrac{1}{2} m (\dot{r}^2 + r^2 \dot{\theta}^2),$$

and, putting $x = m^{1/2} r$, we have

(3.6)    $$ds^2 = dx^2 + (I + x^2)\, d\theta^2.$$


² Cf. E. Cartan, Leçons sur la géométrie des espaces de Riemann, p. 63.


















The Gaussian curvature of the two-space is

(3.7)    $$K = -\frac{1}{(I + x^2)^{1/2}} \frac{d^2}{dx^2} (I + x^2)^{1/2} = -\frac{I}{(I + x^2)^2}.$$

Hence we have an example of a $V_2$ having the connectivity of a cylinder, with
negative variable curvature, tending to zero at an infinite distance.

Consider a system consisting of a circular wire, carrying a bead, the wire
being free to rotate about a diameter. This gives a $V_2$ with the connectivity of
an anchor ring. If $I$ is the moment of inertia of the wire, $m$ the mass of the
bead, and $c$ the radius of the wire, and if $\phi$ is the angular displacement of the
wire and $\theta$ that of the bead measured from the axis of rotation, we have


(3.8)    $$T = \tfrac{1}{2} I \dot{\phi}^2 + \tfrac{1}{2} m c^2 (\dot{\theta}^2 + \sin^2\theta \, \dot{\phi}^2), \qquad ds^2 = mc^2 \, d\theta^2 + (I + mc^2 \sin^2\theta) \, d\phi^2.$$


The Gaussian curvature is


(3.9)    $$K = -\tfrac{1}{2} [mc^2 (I + mc^2 \sin^2\theta)]^{-1/2} \frac{d}{d\theta} \left\{ [mc^2 (I + mc^2 \sin^2\theta)]^{-1/2} \frac{d}{d\theta} (I + mc^2 \sin^2\theta) \right\} = \frac{\sin^4\theta + 2k \sin^2\theta - k}{mc^2 (k + \sin^2\theta)^2}, \qquad k = \frac{I}{mc^2}.$$


For all configurations in which the bead is near the axis of rotation, the curvature
is negative, but there is a band on $V_2$ for which $K$ is positive, namely, for
those values of $\theta$ for which


(3.10)    $$\sin^2\theta > (k^2 + k)^{1/2} - k.$$
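
The closed forms (3.7) and (3.9) can be checked numerically. The sketch below assumes the standard expression $K = -(E\sqrt{G})^{-1}\, d^2\sqrt{G}/du^2$ for a line-element $ds^2 = E\,du^2 + G(u)\,dv^2$ with constant $E$, and approximates the second derivative by a central difference. All function names and the sample values of $I$, $m$, $c$ are illustrative assumptions.

```python
import math


def curvature_numeric(sqrt_g, u, e=1.0, h=1e-3):
    """K = -(1/(E*sqrt(G))) d^2 sqrt(G)/du^2 for ds^2 = E du^2 + G(u) dv^2.

    E is taken to be constant; the second derivative is a central difference.
    """
    second = (sqrt_g(u + h) - 2.0 * sqrt_g(u) + sqrt_g(u - h)) / (h * h)
    return -second / (e * sqrt_g(u))


def k_straight_wire(x, inertia):
    """Closed form (3.7) for the bead on a straight wire."""
    return -inertia / (inertia + x * x) ** 2


def k_circular_wire(theta, inertia, m, c):
    """Closed form (3.9) for the bead on a circular wire, k = I/(m c^2)."""
    k = inertia / (m * c * c)
    s = math.sin(theta) ** 2
    return (s * s + 2.0 * k * s - k) / (m * c * c * (k + s) ** 2)


def positive_band_threshold(inertia, m, c):
    """Right-hand side of (3.10): K > 0 where sin^2(theta) exceeds this."""
    k = inertia / (m * c * c)
    return math.sqrt(k * k + k) - k


if __name__ == "__main__":
    I, m, c = 1.0, 2.0, 0.5  # hypothetical sample values
    for theta in (0.2, 1.0, 1.5):
        numeric = curvature_numeric(
            lambda t: math.sqrt(I + m * c * c * math.sin(t) ** 2),
            theta,
            e=m * c * c,
        )
        print(theta, numeric, k_circular_wire(theta, I, m, c))
    print("threshold on sin^2(theta):", positive_band_threshold(I, m, c))
```

Since $(k^2 + k)^{1/2} - k$ is always less than $\tfrac{1}{2}$, the band of positive curvature exists for every choice of the constants.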


4. The spherical top. Consider a spherical top, that is, a rigid body with
a fixed point $O$, the momental ellipsoid with respect to the fixed point being a
sphere. Using the usual Eulerian angles $\theta$, $\phi$, $\psi$, the kinetic energy is


(4.1)    $$T = \tfrac{1}{2} I (\dot{\theta}^2 + \dot{\phi}^2 + \dot{\psi}^2 + 2 \dot{\phi} \dot{\psi} \cos\theta),$$


where $I$ is the moment of inertia with respect to any axis through $O$. The
corresponding line-element is


(4.2)    $$ds^2 = I (d\theta^2 + d\phi^2 + d\psi^2 + 2 \, d\phi \, d\psi \cos\theta).$$


To investigate the curvature of this $V_3$, it is unnecessary to have recourse to
direct calculation. Given the body in a certain configuration, it is obviously
impossible, without reference to external bodies or to properties not dynamically
essential, to have it displaced in one way rather than another. Geometrically
speaking, all directions at a point in $V_3$ are equivalent. Now if the Riemannian
curvature at a point for an elementary $V_2$ depended on the particular $V_2$ chosen,
this curvature would depend on the normal to the $V_2$, and would possess maximum
and minimum values. As an immediate consequence it would be possible
to distinguish intrinsically between two directions in $V_3$. But this is false.
Hence $V_3$ is isotropic at a point with respect to Riemannian curvature, and hence,
by Schur's Theorem, $V_3$ is of constant curvature, say $K$. To find $K$, it is not
necessary to have recourse to calculation. We know that a rotation about a
fixed axis is a motion under no forces, and hence is represented by a geodesic in
$V_3$. Let us take the rotation in which $\theta$, $\phi$ remain fixed. The length of the
corresponding closed geodesic is


(4.3)    $$s = \int_0^{2\pi} I^{1/2} \, d\psi = 2\pi I^{1/2};$$


every geodesic is, by symmetry, a closed curve of this length.

Let us now take a certain initial configuration, and rotate first about one
arbitrary axis through a complete revolution, and then about another through a
complete revolution. It is easy to see that the two corresponding sequences of
configurations have no configuration in common except the initial configuration.
Geometrically expressed, two geodesics emanating from a point meet again at
that point, but not before. Moreover, any two closed geodesics are reconcilable,
as may be seen on making the corresponding axes of rotation tend to coincidence.
The constant curvature $K$ is therefore positive, for in a space of constant negative
curvature and positive-definite line-element two adjacent geodesics emanating
from a point cannot meet again and be reconcilable.³ Hence $K$ is positive, and
adjacent geodesics emanating from a point first meet at a distance


(4.4)    $$s = \pi K^{-1/2}.$$


Comparing (4.4) with (4.3), we see that the manifold of configurations of a spherical
top is a three-space of constant positive curvature


(4.5)    $$K = 1/(4I).$$


It is a space of the polar type, because the geodesics from a point do not meet
again until they pass through that point.⁴
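
The comparison of (4.4) with (4.3) amounts to a one-line computation, sketched here with hypothetical function names: the length $2\pi I^{1/2}$ of a closed geodesic is equated to $\pi K^{-1/2}$ and solved for $K$.

```python
import math


def closed_geodesic_length(inertia):
    """Length (4.3) of the closed geodesic of a spherical top."""
    return 2.0 * math.pi * math.sqrt(inertia)


def curvature_from_length(length):
    """Solve length = pi * K**(-1/2), equation (4.4), for K."""
    return (math.pi / length) ** 2


def spherical_top_curvature(inertia):
    """Curvature (4.5) obtained by combining (4.3) and (4.4)."""
    return curvature_from_length(closed_geodesic_length(inertia))


if __name__ == "__main__":
    inertia = 2.0  # illustrative value
    print(spherical_top_curvature(inertia), 1.0 / (4.0 * inertia))
```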





³ This follows from equation (4.24) of the paper On the deviation of geodesics ... , Annals
of Math., 35 (1934), 705-713.

⁴ As has been pointed out by a referee, the manifold of configurations of the top is also
the manifold of the group of rotations in three-dimensional space, and may be treated by
the methods appropriate to that point of view: see E. Cartan, La géométrie des groupes
simples, Annali di Matematica, 4 (1927), 209-256.
















5. A rigid body with a fixed point. The configuration-space for a rigid
body with a fixed point and unequal moments of inertia is homeomorphic
with that of the spherical top, but the metric is less simple. It is no longer a
space of constant curvature. It is clear however that since rotation about a
principal axis is a possible motion under no forces, there pass through each point
of the configuration-space $V_3$ three orthogonal closed geodesics of lengths $2\pi A^{1/2}$,
$2\pi B^{1/2}$, $2\pi C^{1/2}$, where $A$, $B$, $C$ are the principal moments of inertia.

It is evident that while the configuration-space of the spherical top may be 
described as homogeneous and isotropic, that of the rigid body with three un- 
equal moments of inertia is homogeneous but anisotropic. At each point there 
exists a principal triad of directions, corresponding to rotations about principal 
axes of inertia. 


UNIVERSITY OF TORONTO. 





ANNALS OF MATHEMATICS
Vol. 36, No. 3, July, 1935 


THE ELECTRON WAVE EQUATION IN DE-SITTER SPACE 


By P. A. M. Dirac 
(Received April 16, 1935) 


1. Introduction. The equations of atomic physics are usually formulated 
in terms of the space-time of the special theory of relativity. They then have 
to form a scheme which remains invariant under all transformations which 
carry the space-time over into itself. These transformations consist of the 
Lorentz rotations about a point combined with arbitrary translations, and form
a group. It is of interest to examine the effect of the various transformations
on the physical equations and so to establish a connexion between physics and 
the mathematical theory of groups. 

Since the time when Einstein’s general theory of relativity first appeared, 
various more general spaces have been proposed. Each of these would necessi- 
tate some modifications in the scheme of equations of atomic physics. The 
effects of these modifications on the laws of atomic physics would be much too 
small to be of any practical interest, and would therefore be, at most, of mathe- 
matical interest. 

Nearly all of the more general spaces have only trivial groups of operations 
which carry the spaces over into themselves, so they spoil the connexion be- 
tween physics and group theory. There is one exception, however, namely the 
de-Sitter space (with no local gravitational fields). This space is associated 
with a very interesting group, and so the study of the equations of atomic 
physics in this space is of special interest, from a mathematical point of view. 


2. De-Sitter Space. De-Sitter space can be pictured as the surface of a
four-dimensional "sphere" (of a hyperbolic character in one direction), embedded
in a five-dimensional space. It may be described conveniently¹ by five
coördinates $x_1$, $x_2$, $x_3$, $x_4$, $x_5$ connected by the relation


(1)    $$x_1^2 + x_2^2 + x_3^2 - x_4^2 + x_5^2 = R^2,$$


$R$ being the radius of the sphere. To see the connexion between this space
and the ordinary space-time of special relativity, we must restrict ourselves to
a small region, say the region for which $x_1$, $x_2$, $x_3$ and $x_4$ are small, in comparison
with $R$, and $x_5$ is equal to $R$, to the first order. The coördinates $x_1$, $x_2$, $x_3$, $x_4$
then become the $x$, $y$, $z$, $t$ of ordinary space-time. The de-Sitter space goes
over into ordinary space-time when $R$ tends to infinity.





¹ See H. P. Robertson, Phil. Mag. V, p. 839 (1928).


We should get a rather similar space if we made the five coördinates satisfy

(2)    $$x_1^2 + x_2^2 + x_3^2 - x_4^2 - x_5^2 = -R^2,$$


instead of (1). The main difference would be that, whereas (1) gives a space
which is infinite in the time direction and finite in spatial directions, (2) gives
a space which is finite in the time direction and infinite in spatial directions.
Most of our work will apply equally well to either space. We can take this
into account by working with five coördinates satisfying the symmetrical
equation


(3)    $$x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 = R^2$$


and supposing $x_4$ to be pure imaginary when we want to have space (1), and
$x_4$, $x_5$ and $R$ to be pure imaginary when we want to have space (2). We shall
write (3) in the contracted form


(4)    $$x_\mu x_\mu = R^2,$$


the suffix $\mu$ running from 1 to 5.

Any set of values for the $x$'s which does not satisfy (4) will determine a
point outside the de-Sitter space, which means a point outside the space of
physics. Thus a physical function of position, such as a field quantity or a
wave function of the quantum theory, will be a function of the $x$'s which is
defined only for values of the $x$'s satisfying (4). Hence it will in general have
no meaning to differentiate a physical function with respect to one of the $x$'s.
The only processes of differentiation which it will have a meaning to apply to a
physical function will be those referring to differentiations along directions in
the de-Sitter space. The operators expressing such differentiations will be of
the form


(5)    $$a_\mu \frac{\partial}{\partial x_\mu},$$

where the $a_\mu$ are functions of the coördinates $x_\mu$ of the point where the
differentiation is performed, satisfying

(6)    $$a_\mu x_\mu = 0.$$


These operators may be characterized by the condition that they commute
with the left-hand side of (4). The most fundamental of them are


(7)    $$x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu},$$


corresponding to the infinitesimal rotations of the de-Sitter space.
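
Because the operators (7) are first-order, they commute with multiplication by $x_\mu x_\mu$ exactly when they annihilate $x_\mu x_\mu$. The sketch below checks this numerically with central differences. Real coördinates are assumed for simplicity, and the function names and the sample point are illustrative.

```python
def partial(f, x, mu, h=1e-4):
    """Central-difference estimate of df/dx_mu at the point x."""
    forward = list(x)
    backward = list(x)
    forward[mu] += h
    backward[mu] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


def rotation_derivative(f, x, mu, nu):
    """Operator (7) applied to f at x: x_mu df/dx_nu - x_nu df/dx_mu."""
    return x[mu] * partial(f, x, nu) - x[nu] * partial(f, x, mu)


def radius_squared(x):
    """Left-hand side of (4), taken here with real coordinates."""
    return sum(component * component for component in x)


POINT = [0.3, -1.2, 0.8, 2.0, 1.5]  # arbitrary illustrative point

# Largest value of (7) acting on x_mu x_mu over all index pairs.
worst = max(
    abs(rotation_derivative(radius_squared, POINT, mu, nu))
    for mu in range(5)
    for nu in range(5)
)

if __name__ == "__main__":
    print("max |(7) applied to x.x| =", worst)
    print("plain d/dx_1 of x.x      =", partial(radius_squared, POINT, 0))
```

By contrast the plain derivative $\partial/\partial x_1$ does not annihilate $x_\mu x_\mu$, which is why it leads outside the de-Sitter space.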

It is sometimes convenient to extend the domain of definition of a physical
function $\chi$ so as to be able to give a meaning to its being operated on by $\partial/\partial x_\mu$.
We can do this by assuming $\chi$ to be expressed as a homogeneous function of
the $x_\mu$ of some arbitrarily chosen degree $n$. If the values of $\chi$ at all points on
the sphere (4) are given and also the degree $n$ is given, then the values of $\chi$ at
points outside the sphere are fixed and $\partial\chi/\partial x_\mu$ acquires a meaning. With this
meaning we shall often have to use Euler's theorem


(8)    $$x_\mu \frac{\partial \chi}{\partial x_\mu} = n\chi.$$


We can change the degree of $\chi$ without altering its value on the sphere (4) by
multiplying it by some power of $x_\mu x_\mu / R^2$.
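
A numerical illustration of (8) and of the change of degree, under the assumption of real coördinates: the hypothetical test function $\chi = x_1 x_2$ has degree 2, and multiplying it by a power of $x_\mu x_\mu / R^2$ shifts the degree while leaving its value on the sphere unchanged. Names and sample values are illustrative only.

```python
R = 3.0


def partial(f, x, mu, h=1e-4):
    """Central-difference estimate of df/dx_mu at the point x."""
    forward = list(x)
    backward = list(x)
    forward[mu] += h
    backward[mu] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


def euler_sum(f, x):
    """Left-hand side of (8): x_mu * df/dx_mu, summed over mu."""
    return sum(x[mu] * partial(f, x, mu) for mu in range(len(x)))


def chi(x):
    """Illustrative homogeneous function of degree 2."""
    return x[0] * x[1]


def with_degree_shift(f, power):
    """Multiply f by (x.x / R^2)**power, which raises its degree by 2*power."""
    def shifted(x):
        return f(x) * (sum(c * c for c in x) / R ** 2) ** power
    return shifted


ON_SPHERE = [1.0, 2.0, 0.0, 0.0, 2.0]  # 1 + 4 + 4 = 9 = R^2

chi_degree_0 = with_degree_shift(chi, -1)
chi_degree_4 = with_degree_shift(chi, 1)

if __name__ == "__main__":
    for name, f in (("n=2", chi), ("n=0", chi_degree_0), ("n=4", chi_degree_4)):
        print(name, f(ON_SPHERE), euler_sum(f, ON_SPHERE))
```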

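An operator (5) for which (6) holds commutes with the operator on the
left-hand side of (8), if the $a_\mu$ are homogeneous of the first degree [as they are
when the operator (5) is of the form (7)]. To prove this result, we note that


$$a_\mu \frac{\partial}{\partial x_\mu} \, x_\nu \frac{\partial}{\partial x_\nu} = a_\mu x_\nu \frac{\partial^2}{\partial x_\mu \partial x_\nu} + a_\mu \frac{\partial}{\partial x_\mu}$$

and

$$x_\nu \frac{\partial}{\partial x_\nu} \, a_\mu \frac{\partial}{\partial x_\mu} = x_\nu a_\mu \frac{\partial^2}{\partial x_\nu \partial x_\mu} + x_\nu \frac{\partial a_\mu}{\partial x_\nu} \frac{\partial}{\partial x_\mu} = x_\nu a_\mu \frac{\partial^2}{\partial x_\nu \partial x_\mu} + a_\mu \frac{\partial}{\partial x_\mu},$$


by an application of (8) with $\chi = a_\mu$ and $n = 1$. Thus we can make free use
of equation (8) when dealing with operators of the form (7).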

A vector in de-Sitter space will be a thing $A_\mu$ having five components. The
condition that the vector shall represent some physical quantity will usually
require that its direction shall lie in the de-Sitter space and thus that


(9)    $$x_\mu A_\mu = 0,$$


$x_\mu$ being the point where the vector is localized. In this way the number of
independent components of $A_\mu$ is reduced to four, in agreement with the number
of components of an ordinary physical vector. If the coördinates $x_\mu$ are
0, 0, 0, 0, $R$, then $A_5 = 0$ and $A_1$, $A_2$, $A_3$, $A_4$ are the ordinary space-time
components.

Any tensor equation of ordinary space-time must correspond to a de-Sitter
equation which is a tensor equation with respect to the five suffixes 1, 2, ..., 5.
The condition that the original equation must be invariant under Lorentz rotations
and translations in the space-time corresponds to the condition that the
new equation must be invariant under all rotations applied to the five coördinates
$x_1$, $x_2$, ..., $x_5$. We shall illustrate this by obtaining the de-Sitter analogues
of some of the important equations of physics.


3. Elementary Physical Equations in De-Sitter Space. Let us take first the
ordinary wave equation in space-time,


(10)    $$\Box \chi = 0.$$










As the de-Sitter analogue of this equation, one might consider the tensor
equation

(11)    $$\frac{\partial}{\partial x_\mu} \frac{\partial}{\partial x_\mu} \chi = 0,$$

summed over the five values for $\mu$. This will not do, however, since the operator
in (11) is a differentiation process going outside the de-Sitter space, as may be
seen from the fact that this operator does not commute with the left-hand
side of (4). We therefore replace (11) by an equation whose operator is built
up from the operators (7), namely the equation


(12)    $$\left( x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu} \right) \left( x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu} \right) \chi = 0.$$


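In order to see the approximate equivalence of (12) and (11), we transform the
operator in (12) thus:


(13)    $$\begin{aligned} \tfrac{1}{2} \left( x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu} \right) \left( x_\mu \frac{\partial}{\partial x_\nu} - x_\nu \frac{\partial}{\partial x_\mu} \right) &= x_\mu \frac{\partial}{\partial x_\nu} x_\mu \frac{\partial}{\partial x_\nu} - x_\mu \frac{\partial}{\partial x_\nu} x_\nu \frac{\partial}{\partial x_\mu} \\ &= x_\mu x_\mu \frac{\partial^2}{\partial x_\nu \partial x_\nu} - x_\mu \frac{\partial}{\partial x_\mu} x_\nu \frac{\partial}{\partial x_\nu} - 3 x_\mu \frac{\partial}{\partial x_\mu}. \end{aligned}$$


Hence if $\chi$ is homogeneous of degree $n$, equation (12) becomes, with the help
of (8),

(14)    $$\left( x_\mu x_\mu \frac{\partial^2}{\partial x_\nu \partial x_\nu} - n^2 - 3n \right) \chi = 0.$$

This takes the simple form (11) when $n = 0$ or $-3$.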
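
The reduction of (12) to (14) can be tested numerically on simple homogeneous polynomials. The sketch below assumes real coördinates, applies the rotation operators (7) twice by nested central differences, and compares the result with the operator in (14). The test functions, the sample point, and all names are illustrative assumptions.

```python
def partial(f, x, mu, h=1e-3):
    """Central-difference estimate of df/dx_mu at the point x."""
    forward = list(x)
    backward = list(x)
    forward[mu] += h
    backward[mu] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


def rotation(f, mu, nu):
    """Return the function x -> (x_mu d/dx_nu - x_nu d/dx_mu) f(x)."""
    def rotated(x):
        return x[mu] * partial(f, x, nu) - x[nu] * partial(f, x, mu)
    return rotated


def half_squared_rotations(f, x):
    """Left side of (13): half the double sum equals the sum over mu < nu."""
    dim = len(x)
    return sum(
        rotation(rotation(f, mu, nu), mu, nu)(x)
        for mu in range(dim)
        for nu in range(mu + 1, dim)
    )


def laplacian(f, x, h=1e-3):
    """Five-dimensional operator of (11) by central second differences."""
    total = 0.0
    for mu in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[mu] += h
        backward[mu] -= h
        total += (f(forward) - 2.0 * f(x) + f(backward)) / (h * h)
    return total


def operator_14(f, x, n):
    """Operator of (14) applied to a function f of degree n."""
    r2 = sum(c * c for c in x)
    return r2 * laplacian(f, x) - (n * n + 3 * n) * f(x)


POINT = [1.0, 2.0, 0.5, -1.0, 0.3]  # arbitrary illustrative point


def product(x):
    return x[0] * x[1]  # degree 2, annihilated by the operator in (11)


def square(x):
    return x[0] * x[0]  # degree 2, not annihilated by the operator in (11)


if __name__ == "__main__":
    for f in (product, square):
        print(half_squared_rotations(f, POINT), operator_14(f, POINT, 2))
```

For the two quadratic test functions the central differences are exact up to rounding, so the two columns should agree closely.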

Let us now express the equations of the electromagnetic field in de-Sitter
space. The 4-vector potential of the ordinary theory must be replaced by a
5-vector potential $A_\mu$ satisfying (9). The usual condition for the vanishing of
the four-dimensional divergence of the 4-vector potential must be replaced by
the vanishing of the five-dimensional divergence of the 5-vector $A_\mu$, thus


(15)    $$\frac{\partial A_\mu}{\partial x_\mu} = 0.$$

From analogy with the usual theory, we should now expect to be able to
introduce the electric and magnetic field quantities as an antisymmetrical tensor
$F_{\mu\nu}$, defined in terms of the potentials by


(16)    $$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu}.$$


It is clear, though, that this definition can be valid only if the degree of the
potentials $A_\mu$ is suitably chosen, since if we change the degree of $A_\mu$ by multiplying
it by some power of $x_\mu x_\mu / R^2$, we not only multiply $F_{\mu\nu}$ by the same
factor, but we add on to it terms involving $x_\mu A_\nu - x_\nu A_\mu$, and thus change
completely the form of equation (16). [Changing the degree of $A_\mu$ does not
invalidate (15), since it adds to (15) a term involving $x_\mu A_\mu$, which vanishes on
account of (9).] To determine the degree for the potentials $A_\mu$ which makes
(16) valid, we note that


$$x_\mu F_{\mu\nu} = x_\mu \frac{\partial A_\nu}{\partial x_\mu} - x_\mu \frac{\partial A_\mu}{\partial x_\nu} = n A_\nu - x_\mu \frac{\partial A_\mu}{\partial x_\nu},$$


where $n$ is the degree of $A_\mu$. Now by differentiating (9) with respect to $x_\nu$
we find


(17)    $$x_\mu \frac{\partial A_\mu}{\partial x_\nu} + A_\nu = 0.$$


Hence

$$x_\mu F_{\mu\nu} = (n + 1) A_\nu.$$


Thus if $n$ differs from $-1$, the potentials $A_\nu$ would be determined in terms of
the field quantities $F_{\mu\nu}$ and we should have quite a different state of affairs
from that of the usual Maxwell theory. We therefore take $n = -1$.
We now have the equations

(18)    $$x_\mu F_{\mu\nu} = 0.$$


These reduce the number of independent components of $F_{\mu\nu}$ to six, the same
as in the ordinary theory. At the point with coördinates 0, 0, 0, 0, $R$, equations
(18) show that all the components of $F_{\mu\nu}$ for which $\mu = 5$ or $\nu = 5$ vanish.
The remaining components can be identified with those of the ordinary theory.
Our choice of $n = -1$ gives us a principle of gauge invariance just as in the
ordinary theory. We can replace the potentials $A_\mu$ by $A_\mu + \partial\chi/\partial x_\mu$, where $\chi$
is any homogeneous function of degree zero satisfying (12) or (11), without
changing the field quantities or any other physically observable quantities.
The further development of the theory follows closely that of the ordinary
electromagnetic theory. One set of Maxwell equations,

(19)    $$\frac{\partial F_{\mu\nu}}{\partial x_\rho} + \frac{\partial F_{\nu\rho}}{\partial x_\mu} + \frac{\partial F_{\rho\mu}}{\partial x_\nu} = 0,$$

follows immediately from (16). The other set, namely


(20)    $$\frac{\partial F_{\mu\nu}}{\partial x_\nu} = j_\mu,$$


may be taken as the definition of the charge-current density vector $j_\mu$. It is
obviously gauge invariant and satisfies the equation of conservation of charge


$$\frac{\partial j_\mu}{\partial x_\mu} = 0.$$


By differentiating (18) with respect to $x_\nu$ we obtain

$$x_\mu j_\mu = 0,$$


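showing that the vector $j_\mu$ defined by (20) lies in the de-Sitter space. Finally,
we can introduce a stress-energy tensor $G_{\mu\nu}$ defined, as in the ordinary theory, by


$$G_{\mu\nu} = F_{\mu\alpha} F_{\nu\alpha} - \tfrac{1}{4} \delta_{\mu\nu} F_{\alpha\beta} F_{\alpha\beta},$$


and leading to the equations of motion


(21)    $$\begin{aligned} \frac{\partial G_{\mu\nu}}{\partial x_\nu} &= F_{\mu\alpha} \frac{\partial F_{\nu\alpha}}{\partial x_\nu} + F_{\nu\alpha} \frac{\partial F_{\mu\alpha}}{\partial x_\nu} - \tfrac{1}{2} F_{\alpha\beta} \frac{\partial F_{\alpha\beta}}{\partial x_\mu} \\ &= j_\alpha F_{\alpha\mu} + F_{\nu\alpha} \left( \frac{\partial F_{\mu\alpha}}{\partial x_\nu} - \tfrac{1}{2} \frac{\partial F_{\nu\alpha}}{\partial x_\mu} \right) \\ &= j_\alpha F_{\alpha\mu}, \end{aligned}$$

with the help of (19).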


4. The Electron Wave Equation. Let us now pass over to the quantum
theory. One of the fundamental ideas here is the method of expression of the
momentum and energy of a particle as differential operators to be used in the
wave equation. The corresponding method of expression in de-Sitter space
would be


(22)    $$p_\mu = -i\hbar \frac{\partial}{\partial x_\mu},$$


giving us a 5-vector $p_\mu$ whose components determine the momentum and energy
of the particle. Equation (8) gives us a linear relation between these components


(23)    $$x_\mu p_\mu = -i\hbar n,$$


when the wave function on which the operators (22) act is homogeneous of
degree $n$. The vector $p_\mu$ lies in the de-Sitter space if the wave function is of
degree zero.

From the momentum (22) we can form the angular momentum


(24)    $$m_{\mu\nu} = x_\mu p_\nu - x_\nu p_\mu.$$










Its components are numerical multiples of the operators (7) and do not involve 
differentiations going outside the de-Sitter space. Thus the angular momentum 
(24) is rather more fundamental than the linear momentum (22). When using 
(24) we do not need to have the wave function homogeneous. 

It is possible to regard the components of the tensor (24) as including both
the angular momentum and the linear momentum, and in this way to get a
more satisfactory de-Sitter analogue of linear momentum than that provided
by (22). If we consider the neighborhood of the point 0, 0, 0, 0, $R$, the components
of $m_{\mu\nu}$ for which $\mu, \nu = 1, 2, 3, 4$ become the ordinary angular momentum
components and the components $m_{51}$, $m_{52}$, $m_{53}$, $m_{54}$ become, apart from
the factor $R$, the ordinary linear momentum components and the energy.

This new interpretation for linear momentum leads one to consider, as the
de-Sitter analogue of the ordinary electron wave equation in the absence of
electromagnetic field, an equation of the form


(25)    $$\{ \alpha_\mu \alpha_\nu m_{\mu\nu} - 2Rm \} \psi = 0, \quad \text{or} \quad \{ \alpha_\mu \alpha_\nu (x_\mu p_\nu - x_\nu p_\mu) - 2Rm \} \psi = 0,$$

$m$ being the mass and the $\alpha$'s being a set of anticommuting Hermitian matrices
whose squares are unity, so that


(26)    $$\alpha_\mu \alpha_\nu + \alpha_\nu \alpha_\mu = 2\delta_{\mu\nu}.$$


We now need five $\alpha$'s, instead of the four of the usual theory. It is possible
to get five, however, without going to more than four rows and columns in the
matrices.² Thus the wave function of the present theory will still have four
components.
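
That five such matrices exist with only four rows and columns can be seen from an explicit example. The sketch below builds one possible set from Kronecker products of the Pauli matrices and checks (26) together with the Hermitian property. This particular representation is an assumption chosen for illustration; the text does not specify one.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def is_hermitian(a):
    n = len(a)
    return all(a[i][j] == a[j][i].conjugate() for i in range(n) for j in range(n))


ID2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# One possible choice of the five matrices (illustrative, not from the text).
ALPHAS = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S2, ID2), kron(S3, ID2)]


def satisfies_26(alphas):
    """Check alpha_mu alpha_nu + alpha_nu alpha_mu = 2 delta_mu_nu."""
    n = len(alphas[0])
    for mu, a in enumerate(alphas):
        for nu, b in enumerate(alphas):
            ab, ba = matmul(a, b), matmul(b, a)
            for i in range(n):
                for j in range(n):
                    expected = 2 if (mu == nu and i == j) else 0
                    if ab[i][j] + ba[i][j] != expected:
                        return False
    return True


if __name__ == "__main__":
    print(satisfies_26(ALPHAS), all(is_hermitian(a) for a in ALPHAS))
```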

Let us examine the connexion between (25) and the ordinary wave equation.
In the neighborhood of the point 0, 0, 0, 0, $R$ equation (25) becomes, on division
by $R$,


$$\{ \alpha_5 \alpha_\nu p_\nu - \alpha_\mu \alpha_5 p_\mu - 2m \} \psi = 0.$$


This reduces to

$$\left\{ \sum_{\mu=1}^{4} \alpha_5 \alpha_\mu p_\mu - m \right\} \psi = 0,$$


which gives, on multiplication by $i\alpha_4\alpha_5$ on the left,


$$\left\{ i p_4 + \sum_{\mu=1}^{3} i \alpha_4 \alpha_\mu p_\mu - i \alpha_4 \alpha_5 m \right\} \psi = 0.$$

This is of the same form as the ordinary wave equation, since the $i p_4$ here
represents the energy [whether our space has the signature (1) or (2)], and the
coefficients of the momenta $p_1$, $p_2$, $p_3$ and of $m$, namely $i\alpha_4\alpha_1$, $i\alpha_4\alpha_2$, $i\alpha_4\alpha_3$ and
$-i\alpha_4\alpha_5$, are anticommuting Hermitian matrices whose squares are unity.


² See Eddington, Proc. Roy. Soc. A 121, p. 524 (1928).


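The fundamental equation (25) may be written in various different forms.
We have

(27)    $$\begin{aligned} \alpha_\mu \alpha_\nu m_{\mu\nu} &= \alpha_\mu \alpha_\nu x_\mu p_\nu - \alpha_\mu \alpha_\nu (p_\mu x_\nu + i\hbar \delta_{\mu\nu}) \\ &= \alpha_\mu x_\mu \alpha_\nu p_\nu - \alpha_\mu p_\mu \alpha_\nu x_\nu - i\hbar \alpha_\mu \alpha_\mu \\ &= (\alpha x)(\alpha p) - (\alpha p)(\alpha x) - 5i\hbar, \end{aligned}$$

using the scalar product notation. Now

$$\alpha_\mu x_\mu \alpha_\nu p_\nu + \alpha_\nu p_\nu \alpha_\mu x_\mu = x_\mu p_\mu + p_\mu x_\mu,$$


all the terms for which $\nu \neq \mu$ cancelling. Thus


(28)    $$(\alpha x)(\alpha p) + (\alpha p)(\alpha x) = (xp) + (px) = 2(xp) - 5i\hbar,$$

and hence (27) gives

(29)    $$\alpha_\mu \alpha_\nu m_{\mu\nu} = 2\{ (\alpha x)(\alpha p) - (xp) \}.$$


Thus the wave equation (25) may be written


(30)    $$\{ (\alpha x)(\alpha p) - (xp) - Rm \} \psi = 0.$$

If we multiply this equation on the left by $(\alpha x)$, a matrix whose square is

(31)    $$(\alpha x)^2 = (xx),$$


we get still another form for it, namely

(32)    $$\{ (xx)(\alpha p) - (\alpha x)(xp) - (\alpha x) Rm \} \psi = 0.$$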
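
Relation (31) follows from the anticommutation rule (26) and may be confirmed on an explicit representation. The sketch below assumes one possible set of five $4 \times 4$ matrices built from Kronecker products of Pauli matrices (an illustrative choice, not taken from the text) and a hypothetical integer-valued point $x_\mu$.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


ID2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# Assumed representation of the five anticommuting matrices.
ALPHAS = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S2, ID2), kron(S3, ID2)]


def alpha_dot(x):
    """The matrix (alpha x) = alpha_mu x_mu."""
    return [
        [sum(x[mu] * ALPHAS[mu][i][j] for mu in range(5)) for j in range(4)]
        for i in range(4)
    ]


def scalar_square(x):
    """The number (xx) = x_mu x_mu."""
    return sum(c * c for c in x)


X = [1, 2, -1, 3, 2]  # hypothetical point with integer coordinates

AX = alpha_dot(X)
AX_SQUARED = matmul(AX, AX)

if __name__ == "__main__":
    print(scalar_square(X))
    for row in AX_SQUARED:
        print(row)
```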


The extension of the theory to include the presence of an electromagnetic
field may be made in the usual way, namely by substituting $p_\mu + eA_\mu$ for $p_\mu$
in the wave equation, the $A_\mu$ being the electromagnetic potentials, which we
discussed in the preceding section. Thus equation (25) becomes, in the presence
of a field,


(33)    $$\{ \alpha_\mu \alpha_\nu [ x_\mu (p_\nu + eA_\nu) - x_\nu (p_\mu + eA_\mu) ] - 2Rm \} \psi = 0.$$


The condition we obtained in the preceding section, that the $A_\mu$ must be
homogeneous of degree $-1$, is satisfactory here, since the $p_\mu$ are also of degree $-1$
and thus the homogeneity of the operator in (33) is preserved.


5. The Hermitian Condition. In the following development of the theory 
we keep to the case of no electromagnetic field, in order to discuss the main 
ideas in the simplest possible way. The presence of a field would require only 
trivial alterations in the work of this section and the next one. 

One of the conditions of ordinary quantum mechanics is that the operator
in the wave equation shall be Hermitian, or at least that one can make it Hermitian
by multiplying it on the left-hand side by some simple factor. Let us
see whether this condition is satisfied by our present wave equation. Multiplying
(25) by $\alpha_4\alpha_5$ on the left, we get


(34)    $$\{ \alpha_4 \alpha_5 \alpha_\mu \alpha_\nu (x_\mu p_\nu - x_\nu p_\mu) - 2 \alpha_4 \alpha_5 Rm \} \psi = 0.$$


Now it is easily seen that (for $\mu \neq \nu$) $\alpha_4\alpha_5\alpha_\mu\alpha_\nu$ is Hermitian when $\mu$ and $\nu$ both
differ from 4 and 5 and when they equal 4 and 5, and that it is anti-Hermitian
($i$ times a Hermitian quantity) when one of the suffixes $\mu$, $\nu$ equals 4 or 5 but
not the other. Further, $\alpha_4\alpha_5$ is anti-Hermitian. It follows that, for $m$ real,
the operator in (34) is Hermitian when our space has the signature (2), since
in this case $x_4$, $x_5$, $p_4$, $p_5$ and $R$ are anti-Hermitian and the other $x$'s and $p$'s
are Hermitian. Thus with the signature (2), our wave equation satisfies the
Hermitian condition when $m$ is real.

The discussion with the signature (1) is not quite so easy. In this case we
must work from the form (32) for the wave equation. We have


$$R^2 (\alpha p) - (\alpha p) R^2 = 2i\hbar (\alpha x),$$

the $R^2$ here being written for $(xx)$, for brevity. Again

$$\begin{aligned} (\alpha x)(xp) - (px)(x\alpha) &= (\alpha x)(xp) - (xp)(\alpha x) + [(xp) - (px)](\alpha x) \\ &= i\hbar (\alpha x) + 5i\hbar (\alpha x) \\ &= 6i\hbar (\alpha x). \end{aligned}$$


Thus (32) may be written


$$\{ \tfrac{1}{2} [R^2 (\alpha p) + (\alpha p) R^2] - \tfrac{1}{2} [(\alpha x)(xp) + (px)(x\alpha)] - (\alpha x) Rm - 2i\hbar (\alpha x) \} \psi = 0.$$

Multiplying by $i\alpha_4$ on the left, we obtain


(35)    $$i\alpha_4 \{ \tfrac{1}{2} [R^2 (\alpha p) + (\alpha p) R^2] - \tfrac{1}{2} [(\alpha x)(xp) + (px)(x\alpha)] - (\alpha x)[Rm + 2i\hbar] \} \psi = 0.$$


When our space has the signature (1), $x_4$ and $p_4$ are anti-Hermitian and the
other $x$'s and $p$'s and $R$ are Hermitian. It is now easily seen that the operator
in (35) is Hermitian provided $m$ is complex so as to make $Rm + 2i\hbar$ real. Thus
with the signature (1), our wave equation satisfies the Hermitian condition
provided $m$ contains a small pure imaginary part $-2i\hbar R^{-1}$. The real part of $m$
can be arbitrary.

If with the signature (1) we had tried to satisfy the Hermitian condition by
working from the form (25) of the wave equation, multiplying it by $i\alpha_4$ on the
left, we should have been led to the result that $m$ must be pure imaginary.
This would not be acceptable, as it would give a theory which does not go over
into the ordinary theory in the limit when $R$ tends to infinity.


6. The Charge-Current Density. We must now set up an expression for
the charge-current density satisfying the conservation theorem. Let us work
with the form (25) of the wave equation. The adjoint wave equation is


(36)    $$\phi \{ \alpha_\mu \alpha_\nu (x_\mu p_\nu - x_\nu p_\mu) - 2Rm \} = 0,$$

in which $p_\mu$, operating to the left, is to be interpreted as

(37)    $$p_\mu = i\hbar \frac{\partial}{\partial x_\mu},$$


differing in sign from (22). Let $\psi$ be any solution of (25) and $\phi$ any solution
of (36), not necessarily representing the same state. Multiplying (25) by $\phi$ on
the left and summing over the four values for the spin variable, we obtain a
result which we shall write as


(38)    $$\phi \, . \, \{ \alpha_\mu \alpha_\nu (x_\mu p_\nu - x_\nu p_\mu) - 2Rm \} \psi = 0.$$

Similarly, multiplying (36) by $\psi$ on the right and summing over the four values
for the spin variable, we obtain a result which we shall write as

(39)    $$\phi \{ \alpha_\mu \alpha_\nu (x_\mu p_\nu - x_\nu p_\mu) - 2Rm \} \, . \, \psi = 0.$$


The only difference between (38) and (39) is that in (38) the $p_\mu$ operate to
the right according to (22) and in (39) they operate to the left according to
(37). If we now subtract (38) from (39) we obtain


$$i\hbar \frac{\partial}{\partial x_\nu} (\phi \alpha_\mu \alpha_\nu x_\mu \psi) - i\hbar \frac{\partial}{\partial x_\mu} (\phi \alpha_\mu \alpha_\nu x_\nu \psi) = 0$$

or

(40)    $$\frac{\partial \rho_\mu}{\partial x_\mu} = 0,$$

where

(41)    $$\rho_\mu = \phi (\alpha_\nu \alpha_\mu - \alpha_\mu \alpha_\nu) x_\nu \psi.$$


This shows that we must take our charge-current density vector to be, apart
from a numerical factor, the $\rho_\mu$ defined by (41). We then have the conservation
theorem (40). The 5-vector $\rho_\mu$ defined by (41) lies in the de-Sitter space,
since


(42)    $$x_\mu \rho_\mu = \phi (\alpha_\nu \alpha_\mu - \alpha_\mu \alpha_\nu) x_\nu x_\mu \psi = 0.$$


We must take the numerical factor to be $\tfrac{1}{2} R^{-1}$ in order that our charge-current
density may go over into the usual one when $R$ tends to infinity. Thus we
have for our charge-current density


(43)    $$\rho_\mu = \tfrac{1}{2} R^{-1} \phi (\alpha_\nu \alpha_\mu - \alpha_\mu \alpha_\nu) x_\nu \psi = R^{-1} \sum_{\nu \neq \mu} \phi \alpha_\nu \alpha_\mu x_\nu \psi.$$


There is one density vector $\rho_\mu$ associated with any $\phi$ and $\psi$ representing two
states, whether these states are the same or not. The condition for $\phi$ and $\psi$
to represent the same state is that $\phi\gamma^{-1}$ and $\psi$ shall be conjugate imaginary,
where $\gamma$ is the operator that one must multiply into the operator (25) on the
left to make it Hermitian. Thus if the space has the signature (2), $\gamma = \alpha_4\alpha_5$,
and if it has the signature (1), $\gamma = i\alpha_4 (\alpha x) R^{-1}$.







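The interpretation of the density vector $\rho_\mu$ is interesting. Let us take a small
volume, localized at the point $x_\mu$ and defined by three small vectors $a_\mu$, $b_\mu$, $c_\mu$.
These vectors must lie in the de-Sitter space and therefore satisfy


(44)    $$x_\mu a_\mu = 0, \qquad x_\mu b_\mu = 0, \qquad x_\mu c_\mu = 0.$$


The small volume will be represented by a 5-vector $v_\mu$, whose components are
given by equations of the type


$$v_1 = R^{-1} \begin{vmatrix} a_2 & b_2 & c_2 & x_2 \\ a_3 & b_3 & c_3 & x_3 \\ a_4 & b_4 & c_4 & x_4 \\ a_5 & b_5 & c_5 & x_5 \end{vmatrix}.$$


The charge intersecting this volume is thus


(45)    $$v_\mu \rho_\mu = R^{-1} \begin{vmatrix} a_1 & b_1 & c_1 & x_1 & \rho_1 \\ a_2 & b_2 & c_2 & x_2 & \rho_2 \\ a_3 & b_3 & c_3 & x_3 & \rho_3 \\ a_4 & b_4 & c_4 & x_4 & \rho_4 \\ a_5 & b_5 & c_5 & x_5 & \rho_5 \end{vmatrix}.$$


Let us substitute for $\rho_\mu$ here its value given by (43) and pick out the coefficient
of $\phi\alpha_1\alpha_2\psi$ or $-\phi\alpha_2\alpha_1\psi$. This will be


$$R^{-2} \begin{vmatrix} a_1 & b_1 & c_1 & x_1 & -x_2 \\ a_2 & b_2 & c_2 & x_2 & x_1 \\ a_3 & b_3 & c_3 & x_3 & 0 \\ a_4 & b_4 & c_4 & x_4 & 0 \\ a_5 & b_5 & c_5 & x_5 & 0 \end{vmatrix}.$$


If in this determinant we add to the first row $x_2/x_1$ times the second row, $x_3/x_1$
times the third row, $x_4/x_1$ times the fourth row, and $x_5/x_1$ times the fifth row,
we get, with the help of (44),


$$R^{-2} \begin{vmatrix} 0 & 0 & 0 & R^2/x_1 & 0 \\ a_2 & b_2 & c_2 & x_2 & x_1 \\ a_3 & b_3 & c_3 & x_3 & 0 \\ a_4 & b_4 & c_4 & x_4 & 0 \\ a_5 & b_5 & c_5 & x_5 & 0 \end{vmatrix} = \begin{vmatrix} a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \\ a_5 & b_5 & c_5 \end{vmatrix}.$$


Thus the total expression (45) for the charge intersecting the volume becomes


(46)    $$v_\mu \rho_\mu = \tfrac{1}{2} \phi \begin{vmatrix} a_1 & b_1 & c_1 & \alpha_1 & \alpha_1 \\ a_2 & b_2 & c_2 & \alpha_2 & \alpha_2 \\ a_3 & b_3 & c_3 & \alpha_3 & \alpha_3 \\ a_4 & b_4 & c_4 & \alpha_4 & \alpha_4 \\ a_5 & b_5 & c_5 & \alpha_5 & \alpha_5 \end{vmatrix} \psi,$$


where the order of the factors given by the last two columns is to be preserved
in the evaluation of the determinant.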
























(36)  $\phi\{\alpha_\mu\alpha_\nu(x_\mu p_\nu - x_\nu p_\mu) - 2Rm\} = 0,$

in which $p_\mu$, operating to the left, is to be interpreted as

(37)  $p_\mu = ih\,\dfrac{\partial}{\partial x_\mu},$

differing in sign from (22). Let $\psi$ be any solution of (25) and $\phi$ any solution
of (36), not necessarily representing the same state. Multiplying (25) by $\phi$ on
the left and summing over the four values for the spin variable, we obtain a
result which we shall write as

(38)  $\phi\,.\{\alpha_\mu\alpha_\nu(x_\mu p_\nu - x_\nu p_\mu) - 2Rm\}\psi = 0.$

Similarly, multiplying (36) by $\psi$ on the right and summing over the four values
for the spin variable, we obtain a result which we shall write as

(39)  $\phi\{\alpha_\mu\alpha_\nu(x_\mu p_\nu - x_\nu p_\mu) - 2Rm\}.\,\psi = 0.$

The only difference between (38) and (39) is that in (38) the $p_\mu$ operate to
the right according to (22) and in (39) they operate to the left according to
(37). If we now subtract (38) from (39) we obtain

$ih\,\dfrac{\partial}{\partial x_\nu}(\phi\alpha_\mu\alpha_\nu x_\mu\psi) - ih\,\dfrac{\partial}{\partial x_\mu}(\phi\alpha_\mu\alpha_\nu x_\nu\psi) = 0$

or

(40)  $\dfrac{\partial\rho_\mu}{\partial x_\mu} = 0,$

where

(41)  $\rho_\mu = \phi(\alpha_\nu\alpha_\mu - \alpha_\mu\alpha_\nu)x_\nu\psi.$

This shows that we must take our charge-current density vector to be, apart
from a numerical factor, the $\rho_\mu$ defined by (41). We then have the conservation
theorem (40). The 5-vector $\rho_\mu$ defined by (41) lies in the de-Sitter space,
since

(42)  $x_\mu\rho_\mu = \phi(\alpha_\nu\alpha_\mu - \alpha_\mu\alpha_\nu)x_\nu x_\mu\psi = 0.$


We must take the numerical factor to be $\tfrac12 R^{-1}$ in order that our charge-current
density may go over into the usual one when $R$ tends to infinity. Thus we
have for our charge-current density

(43)  $\rho_\mu = \tfrac12 R^{-1}\phi(\alpha_\nu\alpha_\mu - \alpha_\mu\alpha_\nu)x_\nu\psi = R^{-1}\sum_{\nu\neq\mu}\phi\,\alpha_\nu\alpha_\mu x_\nu\psi.$

There is one density vector $\rho_\mu$ associated with any $\phi$ and $\psi$ representing two
states, whether these states are the same or not. The condition for $\phi$ and $\psi$
to represent the same state is that $\phi\gamma^{-1}$ and $\psi$ shall be conjugate imaginary,
where $\gamma$ is the operator that one must multiply into the operator (25) on the
left to make it Hermitian. Thus if the space has the signature (2), $\gamma = \alpha_5$,
and if it has the signature (1), $\gamma = i\alpha_5(\alpha x)R^{-1}$.




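The interpretation of the density vector $\rho_\mu$ is interesting. Let us take a small
volume, localized at the point $x_\mu$ and defined by three small vectors $a_\mu$, $b_\mu$, $c_\mu$.
These vectors must lie in the de-Sitter space and therefore satisfy

(44)  $x_\mu a_\mu = 0, \qquad x_\mu b_\mu = 0, \qquad x_\mu c_\mu = 0.$

The small volume will be represented by a 5-vector $v_\mu$, whose components are
given by equations of the type

$v_1 = R^{-1}\begin{vmatrix} a_2 & b_2 & c_2 & x_2 \\ a_3 & b_3 & c_3 & x_3 \\ a_4 & b_4 & c_4 & x_4 \\ a_5 & b_5 & c_5 & x_5 \end{vmatrix}.$

The charge intersecting this volume is thus

(45)  $v_\mu\rho_\mu = R^{-1}\begin{vmatrix} a_1 & b_1 & c_1 & x_1 & \rho_1 \\ a_2 & b_2 & c_2 & x_2 & \rho_2 \\ a_3 & b_3 & c_3 & x_3 & \rho_3 \\ a_4 & b_4 & c_4 & x_4 & \rho_4 \\ a_5 & b_5 & c_5 & x_5 & \rho_5 \end{vmatrix}.$

Let us substitute for $\rho_\mu$ here its value given by (43) and pick out the coefficient
of $\phi\alpha_1\alpha_2\psi$ or $-\phi\alpha_2\alpha_1\psi$. This will be

$\tfrac12 R^{-2}\begin{vmatrix} a_1 & b_1 & c_1 & x_1 & -x_2 \\ a_2 & b_2 & c_2 & x_2 & x_1 \\ a_3 & b_3 & c_3 & x_3 & 0 \\ a_4 & b_4 & c_4 & x_4 & 0 \\ a_5 & b_5 & c_5 & x_5 & 0 \end{vmatrix}.$

If in this determinant we add to the first row $x_2/x_1$ times the second row, $x_3/x_1$
times the third row, $x_4/x_1$ times the fourth row, and $x_5/x_1$ times the fifth row,
we get, with the help of (44)

$R^{-2}\begin{vmatrix} 0 & 0 & 0 & R^2/x_1 & 0 \\ a_2 & b_2 & c_2 & x_2 & x_1 \\ a_3 & b_3 & c_3 & x_3 & 0 \\ a_4 & b_4 & c_4 & x_4 & 0 \\ a_5 & b_5 & c_5 & x_5 & 0 \end{vmatrix} = \begin{vmatrix} a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \\ a_5 & b_5 & c_5 \end{vmatrix}.$

Thus the total expression (45) for the charge intersecting the volume becomes

(46)  $v_\mu\rho_\mu = \tfrac12\,\phi\begin{vmatrix} a_1 & b_1 & c_1 & \alpha_1 & \alpha_1 \\ a_2 & b_2 & c_2 & \alpha_2 & \alpha_2 \\ a_3 & b_3 & c_3 & \alpha_3 & \alpha_3 \\ a_4 & b_4 & c_4 & \alpha_4 & \alpha_4 \\ a_5 & b_5 & c_5 & \alpha_5 & \alpha_5 \end{vmatrix}\psi,$

where the order of the factors given by the last two columns is to be preserved
in the evaluation of the determinant.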
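The row reduction that turns the 5-rowed determinant into a 3-rowed one can be checked on numbers. The following is only a sketch under stated assumptions: the sample vectors are hypothetical, and a Euclidean sum $x_\mu x_\mu = R^2$ is assumed for simplicity, since the identity is purely algebraic once (44) and $x_\mu x_\mu = R^2$ hold.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 5x5)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# Hypothetical sample data; the Euclidean sum x.x = R^2 is an assumption.
x = [1, 2, 2, 4, 0]
R2 = sum(v * v for v in x)            # R^2
a = [2, -1, 0, 0, 3]
b = [0, 1, -1, 0, 1]
c = [4, 0, 0, -1, 2]

# Condition (44): each small vector is orthogonal to x.
orthogonal = all(sum(p * q for p, q in zip(x, w)) == 0 for w in (a, b, c))

# Last column (-x_2, x_1, 0, 0, 0), as in the coefficient picked out in the text.
last = [-x[1], x[0], 0, 0, 0]
big = [[a[i], b[i], c[i], x[i], last[i]] for i in range(5)]
small = [[a[i], b[i], c[i]] for i in range(2, 5)]

lhs = Fraction(det(big), R2)          # R^-2 times the 5-rowed determinant
rhs = det(small)                      # 3-rowed determinant on rows 3, 4, 5
```

With these values `lhs` and `rhs` agree, which is what the row operation asserts; any other vectors satisfying (44) with $x_1 \neq 0$ could be substituted.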










7. Constants of the Motion. A further problem of interest is to determine
the constants of the motion for our electron in de-Sitter space in the absence
of electromagnetic field. A constant of the motion is an operator having the
mathematical property that, when multiplied into any $\psi$ satisfying the wave
equation, it produces another $\psi$ satisfying the wave equation. A sufficient (but
by no means necessary) condition for this is that it shall commute with the
operator in the wave equation, for any one of the forms in which the wave
equation may be written.

Let us look for things commuting with the operator in (25), namely the
operator

$\alpha_\mu\alpha_\nu m_{\mu\nu} - 2Rm = 2H$

say. We have, using well-known commutation relations of the type

$[m_{12}, m_{13}] = m_{23},$

$[m_{12}, H] = [m_{12}, \tfrac12\sum\alpha_\mu\alpha_\nu m_{\mu\nu}]$
$\qquad = \alpha_1\alpha_3 m_{23} + \alpha_1\alpha_4 m_{24} + \alpha_1\alpha_5 m_{25} - \alpha_2\alpha_3 m_{13} - \alpha_2\alpha_4 m_{14} - \alpha_2\alpha_5 m_{15}.$

Again, from (26)

$ih[\alpha_1\alpha_2, H] = \alpha_1\alpha_2\,\tfrac12\sum\alpha_\mu\alpha_\nu m_{\mu\nu} - \tfrac12\sum\alpha_\mu\alpha_\nu m_{\mu\nu}\,\alpha_1\alpha_2$
$\qquad = 2\{-\alpha_2\alpha_3 m_{13} - \alpha_2\alpha_4 m_{14} - \alpha_2\alpha_5 m_{15} + \alpha_1\alpha_3 m_{23} + \alpha_1\alpha_4 m_{24} + \alpha_1\alpha_5 m_{25}\}.$

Thus

$[m_{12} - \tfrac12 ih\,\alpha_1\alpha_2, H] = 0,$

showing that $m_{12} - \tfrac12 ih\,\alpha_1\alpha_2$ is a constant of the motion. More generally,
$m_{\mu\nu} - \tfrac12 ih\,\alpha_\mu\alpha_\nu$ is a constant of the motion. This shows that $-\tfrac12 ih\,\alpha_\mu\alpha_\nu$ is to be
interpreted as the spin angular momentum, which must be added to the orbital
angular momentum to give a constant of the motion.

We have seen that at the point 0, 0, 0, 0, $R$ the components $m_{5\mu}$ (with
$\mu$ = 1, 2, 3, 4) of $m_{\mu\nu}$ are to be interpreted as $R$ times the components of
linear momentum and the energy. Thus the quantities $-\tfrac12 ih\,\alpha_5\alpha_\mu R^{-1}$ (with
$\mu$ = 1, 2, 3, 4) are to be interpreted as the components of a spin linear
momentum and a spin energy.
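The matrix part of the computation of $ih[\alpha_1\alpha_2, H]$ can be verified with explicit matrices. The sketch below assumes one particular set of five anticommuting 4-rowed matrices with unit squares, built from Pauli matrices; this representation and all function names are illustrative choices, not taken from the paper.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product of two matrices."""
    return [[p * q for p in ra for q in rb] for ra in a for rb in b]


def scale(c, a):
    return [[c * v for v in row] for row in a]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# Assumed representation: five mutually anticommuting matrices, each squaring to 1.
ALPHA = {1: kron(S1, S1), 2: kron(S1, S2), 3: kron(S1, S3),
         4: kron(S3, I2), 5: kron(S2, I2)}
ZERO = [[0] * 4 for _ in range(4)]


def pair(m, n):
    """The product alpha_m alpha_n."""
    return matmul(ALPHA[m], ALPHA[n])


def commutator(a, b):
    """Plain matrix commutator ab - ba (no division by ih)."""
    return sub(matmul(a, b), matmul(b, a))


def expected(m, n):
    """Value of alpha_1 alpha_2 (alpha_m alpha_n) - (alpha_m alpha_n) alpha_1 alpha_2 for m < n."""
    if m == 1 and n > 2:
        return scale(-2, pair(2, n))
    if m == 2:
        return scale(2, pair(1, n))
    return ZERO
```

Multiplying each `expected(m, n)` by $m_{\mu\nu}$ and summing over $\mu < \nu$ reproduces the bracket $2\{\dots\}$ quoted in the text.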


8. The Second-Order Wave Equation. Let us now eliminate the $\alpha$ matrices
from the wave equation for no field and obtain a second-order wave equation
for each of the four components of the wave function. Multiplying (25) by
$\tfrac14(\alpha_\mu\alpha_\nu m_{\mu\nu} + 2Rm)$ on the left, we obtain, since $R$ commutes with the $m_{\mu\nu}$,

(47)  $\{(\tfrac12\alpha_\mu\alpha_\nu m_{\mu\nu})^2 - R^2m^2\}\psi = 0.$

Now from (29) and (28)

$(\tfrac12\alpha_\mu\alpha_\nu m_{\mu\nu})^2 = (\alpha x)(\alpha p)(\alpha x)(\alpha p) - (\alpha x)(\alpha p)(xp) - (xp)(\alpha x)(\alpha p) + (xp)^2$
$\qquad = (\alpha x)[-(\alpha x)(\alpha p) + 2(xp) - 5ih](\alpha p) - (\alpha x)[(xp)(\alpha p) - ih(\alpha p)]$
$\qquad\quad - [(\alpha x)(xp) - ih(\alpha x)](\alpha p) + (xp)^2$
$\qquad = -(xx)(pp) - 3ih(\alpha x)(\alpha p) + (xp)^2.$

Hence (47) becomes

$\{(xx)(pp) + 3ih(\alpha x)(\alpha p) - (xp)^2 + R^2m^2\}\psi = 0,$

which reduces to

$\{(xx)(pp) - (xp)^2 + 3ih(xp) + 3ihRm + R^2m^2\}\psi = 0$

with the help of (30). From (13), which may be written

$\tfrac12 m_{\mu\nu}m_{\mu\nu} = (xx)(pp) - (xp)^2 + 3ih(xp),$

we get finally as our second-order wave equation

(48)  $\{\tfrac12 m_{\mu\nu}m_{\mu\nu} + R^2m^2 + 3ihRm\}\psi = 0.$
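The middle step in the expansion of the squared operator rests on three reorderings. They are sketched here assuming the relations $\alpha_\mu\alpha_\nu + \alpha_\nu\alpha_\mu = 2\delta_{\mu\nu}$ and $x_\mu p_\nu - p_\nu x_\mu = ih\,\delta_{\mu\nu}$ in five dimensions; these are presumably what (26) and (28) supply, but that is an assumption, since those equations are not reproduced in this part of the text.

```latex
\begin{align*}
(\alpha p)(\alpha x) &= \alpha_\mu\alpha_\nu\,p_\mu x_\nu
   = \alpha_\mu\alpha_\nu\,(x_\nu p_\mu - ih\,\delta_{\mu\nu}) \\
  &= (2\delta_{\mu\nu} - \alpha_\nu\alpha_\mu)\,x_\nu p_\mu - 5ih
   = -(\alpha x)(\alpha p) + 2(xp) - 5ih, \\
(\alpha p)(xp) &= \alpha_\mu\,(x_\nu p_\mu - ih\,\delta_{\mu\nu})\,p_\nu
   = (xp)(\alpha p) - ih(\alpha p), \\
(xp)(\alpha x) &= \alpha_\mu\,x_\nu\,(x_\mu p_\nu - ih\,\delta_{\mu\nu})
   = (\alpha x)(xp) - ih(\alpha x).
\end{align*}
% Collecting terms with (\alpha x)^2 = (xx) and (\alpha p)^2 = (pp):
% the (\alpha x)(xp)(\alpha p) terms cancel (2 - 1 - 1 = 0) and the
% ih(\alpha x)(\alpha p) terms add to (-5 + 1 + 1) = -3.
```

The count $-5 + 1 + 1 = -3$ is what produces the coefficient $-3ih$ in the last line of the expansion.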


It should however be noticed that the operator in this wave equation is not
Hermitian, when our space has the signature (1) requiring $R$ and $m + 2ihR^{-1}$
to be real. The components of our wave function $\psi$ are then not solutions
of the wave equation (12) generalized by the addition of a real mass term
in the operator, as one might expect from direct analogy with the ordinary
theory. The small non-Hermitian term in (48) causes an exponential decrease
or increase in each component of $\psi$. This does not lead to a contradiction to
the conservation law, which we have already proved, since the matrix
representing the charge density is not of the usual positive definite form, although it
approximates to this form inside a small region of space.


I would like to thank Prof. Wigner for his help in this work, in particular for
obtaining the Hermitian condition with the signature (1).


Princeton, N. J.























ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


A SET OF INDEPENDENT POSTULATES FOR PROPOSITIONAL 
FUNCTIONS OF ONE VARIABLE 


By B. Notcutt


(Received March 28, 1934, Revised August 15, 1934) 


The following calculus of propositions is based in its essentials upon that given
in Principia Mathematica *10. But it differs from the latter in several important
respects. The primitive propositions of Principia are not completely formalised,
for they contain such words as "proposition," "function," etc., which refer
to the meaning of the symbolism, and without an understanding of which the
calculus cannot be developed. The reduction of the primitive propositions of *1
to an abstract set of postulates has been understood for some time.¹ Our paper
extends this abstract postulation to the functional calculus. In reformulating
our postulates we have followed Russell and Whitehead as closely as possible;
we have retained symbols which are real variables representing individuals and
predicates, which form the elements of the system; conjunction, negation,
implication etc. still appear as operations on these elements. Thus we have not
followed the more revolutionary methods of Schönfinkel² and Curry³ in which all
variables are eliminated, and the logical operations themselves become the
elements of the system.

Our system differs from that of the authors of Principia in that the postulates 
can be understood without interpreting the elements and operations as the 
undefined terms of intuitional logic; and from that of the formalists in that our 
postulates and theorems concern, not the symbols themselves, but the objects 
which they symbolise; though their interpretation is not prescribed within the 
postulates. 

The chief obstacle to this formal reduction is the peculiar nature of the
apparent variable. For in the functional calculus, in addition to propositions of
the form $\phi x$ ($x$ has the property $\phi$) we need expressions of the form $(x)\,.\,\phi x$ (every
$x$ has the property $\phi$). In Principia this is treated as an operation on $\phi x$, and
called generalization of $\phi x$. Admirable though this symbolism is for practical
use, it raises a theoretical difficulty. For the "$x$" in $\phi x$ directly represents some
object, while in $(x)\,.\,\phi x$ the $x$ does not directly represent an object, but rather a
collection of objects. So long as the symbolism contains apparent variables,
it is impossible to reduce the primitive propositions to an abstract set of
postulates, for while the symbols mean their objects in two different ways, their
meaning cannot be disregarded. Our first task is, therefore, the elimination of the
apparent variable.⁴ This is achieved by treating generalization not as an
operation on a proposition $\phi x$, but as an operation on a predicate $\phi$. Thus, "for
some $x$, $\phi x$ is true" is written $E\phi$; and "$\phi$ is true of every $x$," is written $G\phi$.⁵ The
new postulates 3, 4, 7, 8, are made necessary by this change. The much discussed
sign "$\vdash$" has been retained with a new, but equivalent meaning.

1 See especially E. V. Huntington's "Postulates for the Algebra of Logic" (sixth set),
Trans. Am. Math. Soc., Vol. 35.

2 Math. Annalen, Vol. 92 (1924), p. 305.

3 Amer. Journal of Math., vol. 52 (1930), p. 509 and p. 789, Annals of Math., vol. 32 (1931),
p. 154, and vol. 34 (1933), p. 381.

Three kinds of elements will be required: individuals, or subjects of
propositions, represented by $x, y, \dots$; predicates of propositions, represented by
$\phi, \psi, \dots$; and propositions, represented in general by $p, q, \dots$, or in
particular by $\phi x$, or by $E\phi$. A subclass of propositions will be indicated, which
are tautologies, i.e. which are always true no matter what significant values are
substituted for the variables. These four classes are indicated in our symbolism
by A, B, C and T respectively.

Four operations are required—predication, between a predicate and an 
individual; disjunction, between two predicates, or between two propositions; 
negation, applied to a predicate or to a proposition; generalization, applied to a 
predicate. 

No postulates have been included concerning the theory of types, for since 
this refers to the meaning of the symbols, it belongs rather to their interpretation 
than to the development of the calculus. 


Base: a class K, containing three mutually exclusive classes of elements,
A, B, C, and a subclass T of C.
Operations: monadic: $E$, $-$.
dyadic: $\vee$, predication (indicated by simple conjunction).
Postulates:
1. If $\phi$ is a B-element and $x$ is an A-element, then $\phi x$ is a C-element.
2. If $\phi$ is a B-element, then $E\phi$ is a C-element.
3. If $\phi$ is a B-element, then $-\phi$ is a B-element.
4. If $\phi$, $\psi$ are B-elements, then $\phi \vee \psi$ is a B-element.
5. If $p$ is a C-element, then $-p$ is a C-element.
6. If $p$, $q$ are C-elements, then $p \vee q$ is a C-element.
If 1-6, then:
7. $(-\phi)y$ is the same element as $-(\phi y)$.
8. $(\phi \vee \psi)y$ is the same element as $\phi y \vee \psi y$.
("$p$ is a T-element" will be written "$\vdash p$.")
9. $\vdash -(p \vee p)\,.\vee.\,p$


4 The alternative policy of eliminating real variables and retaining only apparent
variables, has been adopted by Alonzo Church. See "A Set of Postulates for the Foundation of
Logic," Annals of Math., vol. 33, p. 346 and vol. 34, p. 839.

5 The operations $\Sigma$ and $\Pi$ used by Church and Curry have the same effect.




























10. $\vdash -q\,.\vee.\,p \vee q$

11. $\vdash -(p \vee q)\,.\vee.\,q \vee p$

12. $\vdash -(-q \vee r)\,.\vee:-(p \vee q)\,.\vee.\,p \vee r$

13. If $\vdash p$ and $\vdash -p \vee q$, then $\vdash q$.

$-\phi y$ = df. $-(\phi y)$

14. $\vdash -\phi y \vee E\phi$

$E(p \vee \phi)\,.$ = df. $p \vee E\phi$
$-E{-}(\phi \vee p)\,.$ = df. $-E{-}\phi \vee p$
$E(\phi \vee p)\,.$ = df. $E\phi \vee p$
$-E{-}(p \vee \phi)\,.$ = df. $p \vee -E{-}\phi$

15. If $\vdash p \vee -\phi y$, then $\vdash p \vee -E\phi$, provided that $y$ does not appear as a
constituent of $p$.
16. If $\vdash p$, then not $\vdash -p$.

Consistency and independence of the postulates

The following system⁶ satisfies all the postulates:

K = 0, 1, 2, 3, 4   A = 0   B = 1, 2   C = 3, 4   T = 4.

("pr" stands for predication.)

 pr | 0  1  2  3  4 | E        ∨ | 0  1  2  3  4 | −
  0 | .  .  .  .  . | .        0 | .  .  .  .  . | .
  1 | 3  .  .  .  . | 3        1 | .  1  2  .  . | 2
  2 | 4  .  .  .  . | 4        2 | .  2  2  .  . | 1
  3 | .  .  .  .  . | .        3 | .  .  .  3  4 | 4
  4 | .  .  .  .  . | .        4 | .  .  .  4  4 | 3

That 7-16 cannot be deduced from 1-6 is obvious, since no mention of
T-elements occurs in the preliminary postulates. That 1-6 be not deducible from
7-16 can easily be insured by the provision of suitable hypotheses preceding
7-16 severally. Since this method is well known, it is summarized in the general
expression "If 1-6, then."

Independence systems for 1-6 are obtained by omitting suitable areas from
the consistency system given above. Thus, System 1 consists of the
consistency system unchanged except that the "pr" table is altogether blank.

To save space, tables for A- and B-elements, and for C-elements will
sometimes be given separately; this is possible because no operations occur between
A- or B-elements, and C-elements.

6 Hilbert and Ackermann: "Grundzüge der theoretischen Logik," p. 65 give a similar proof
of the consistency of the "engere Funktionenkalkül."





(7). K = 0, 1, 2, 3   A = 0   B = 1   C = 2, 3   T = 3.

 pr | 0 | E | −        ∨ | 1        ∨ | 2  3 | −
  1 | 3 | 3 | 1        1 | 1        2 | 2  3 | 3
                                    3 | 3  3 | 2

$(-\phi)y = 3$, but $-(\phi y) = 2$.

(8). K = 0, 1, 2, 3, 4   A = 0   B = 1, 2   C = 3, 4   T = 4.

 pr | 0 | E | −        ∨ | 1  2        ∨ | 3  4 | −
  1 | 3 | 3 | 2        1 | 1  2        3 | 3  4 | 4
  2 | 4 | 4 | 1        2 | 1  1        4 | 4  4 | 3

$\phi = 2$, $\psi = 2$, gives $(\phi \vee \psi)y = 3$, but $\phi y \vee \psi y = 4$.

(9). K = 0, 1, 2, 3, 4, 5, 6   A = 0   B = 1, 2, 3   C = 4, 5, 6   T = 5.

 pr | 0 | E | −        ∨ | 1  2  3        ∨ | 4  5  6 | −
  1 | 4 | 4 | 2        1 | 1  2  3        4 | 4  5  6 | 5
  2 | 5 | 5 | 1        2 | 2  2  2        5 | 5  5  5 | 4
  3 | 6 | 6 | 3        3 | 3  2  2        6 | 6  5  5 | 6

$p = 6$ gives $-(p \vee p)\,.\vee.\,p = 6$.
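Finite systems of this kind can be checked mechanically. The sketch below tests postulates 7-16 by brute force on the consistency system and on the independence systems (7) and (8). The table entries are transcribed from the scan and some of them are reconstructed, so they should be treated as assumptions; the function and dictionary names are illustrative only.

```python
from itertools import product


def failed_postulates(s):
    """Return the numbers of postulates 7-16 that fail in the finite system s."""
    A, B, C, T = s["A"], s["B"], s["C"], s["T"]
    pr, E, neg, vee = s["pr"], s["E"], s["neg"], s["or"]

    def v(a, b):
        return vee[(a, b)]

    def thesis(p):
        return p in T

    checks = {
        7: all(pr[(neg[f], y)] == neg[pr[(f, y)]] for f in B for y in A),
        8: all(pr[(v(f, g), y)] == v(pr[(f, y)], pr[(g, y)])
               for f in B for g in B for y in A),
        9: all(thesis(v(neg[v(p, p)], p)) for p in C),
        10: all(thesis(v(neg[q], v(p, q))) for p in C for q in C),
        11: all(thesis(v(neg[v(p, q)], v(q, p))) for p in C for q in C),
        12: all(thesis(v(neg[v(neg[q], r)], v(neg[v(p, q)], v(p, r))))
                for p, q, r in product(C, repeat=3)),
        13: all(thesis(q) for p in C for q in C
                if thesis(p) and thesis(v(neg[p], q))),
        14: all(thesis(v(neg[pr[(f, y)]], E[f])) for f in B for y in A),
        # Postulate 15: the hypothesis must hold for every individual y.
        15: all(thesis(v(p, neg[E[f]])) for p in C for f in B
                if all(thesis(v(p, neg[pr[(f, y)]])) for y in A)),
        16: all(not thesis(neg[p]) for p in C if thesis(p)),
    }
    return [n for n, ok in checks.items() if not ok]


def two_valued(false, true):
    """Classical 'or' and negation tables on the pair of C-elements."""
    vee = {(a, b): (true if true in (a, b) else false)
           for a in (false, true) for b in (false, true)}
    return vee, {false: true, true: false}


or34, neg34 = two_valued(3, 4)
or23, neg23 = two_valued(2, 3)

consistency = {
    "A": [0], "B": [1, 2], "C": [3, 4], "T": {4},
    "pr": {(1, 0): 3, (2, 0): 4}, "E": {1: 3, 2: 4},
    "neg": {1: 2, 2: 1, **neg34},
    "or": {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 2, **or34},
}
system_7 = {
    "A": [0], "B": [1], "C": [2, 3], "T": {3},
    "pr": {(1, 0): 3}, "E": {1: 3},
    "neg": {1: 1, **neg23},
    "or": {(1, 1): 1, **or23},
}
# System (8) differs from the consistency system only in the B-table for 'or'.
system_8 = dict(consistency)
system_8["or"] = {(1, 1): 1, (1, 2): 2, (2, 1): 1, (2, 2): 1, **or34}
```

Calling `failed_postulates` on each dictionary lists the postulates that the system violates; an independence system for postulate n should return exactly `[n]`.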




(10). K = 0, 1, 2, 3, 4, 5, 6, 7, 8   A = 0   B = 1, 2, 3, 4   C = 5, 6, 7, 8   T = 6.

 pr | 0 | E | −        ∨ | 1  2  3  4        ∨ | 5  6  7  8 | −
  1 | 5 | 5 | 2        1 | 1  2  1  1        5 | 5  6  5  5 | 6
  2 | 6 | 6 | 1        2 | 2  2  2  2        6 | 6  6  6  6 | 5
  3 | 7 | 7 | 4        3 | 1  2  1  2        7 | 5  6  5  6 | 8
  4 | 8 | 8 | 3        4 | 1  2  2  1        8 | 5  6  6  5 | 7

(11). K = 0, 1, 2, 3, 4, 5, 6, 7, 8   A = 0   B = 1, 2, 3, 4   C = 5, 6, 7, 8   T = 6.

 pr | 0 | E | −        ∨ | 1  2  3  4        ∨ | 5  6  7  8 | −
  1 | 5 | 5 | 2        1 | 1  2  3  4        5 | 5  6  7  8 | 6
  2 | 6 | 6 | 1        2 | 2  2  2  2        6 | 6  6  6  6 | 5
  3 | 7 | 7 | 4        3 | 3  2  3  2        7 | 7  6  7  6 | 8
  4 | 8 | 8 | 3        4 | 1  2  2  4        8 | 5  6  6  8 | 7

(12). K = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12   A = 0   B = 1, 2, 3, 4, 5, 6
C = 7, 8, 9, 10, 11, 12   T = 8.

 pr |  0 |  E | −        ∨ | 1  2  3  4  5  6
  1 |  7 |  7 | 2        1 | 1  2  3  4  5  6
  2 |  8 |  8 | 1        2 | 2  2  2  2  2  2
  3 |  9 |  9 | 4        3 | 3  2  3  2  2  2
  4 | 10 | 10 | 3        4 | 4  2  2  4  5  6
  5 | 11 | 11 | 6        5 | 5  2  2  5  5  2
  6 | 12 | 12 | 5        6 | 6  2  2  6  2  6

  ∨ |  7   8   9  10  11  12 |  −
  7 |  7   8   9  10  11  12 |  8
  8 |  8   8   8   8   8   8 |  7
  9 |  9   8   9   8   8   8 | 10
 10 | 10   8   8  10  11  12 |  9
 11 | 11   8   8  11  11   8 | 12
 12 | 12   8   8  12   8  12 | 11

The systems for 9-12 we have adapted from P. Henle's paper, "The independence
of the postulates of logic," Bull. Am. Math. Soc. Vol. 38. In each of
Henle's systems the law $\vdash p \vee -p$ holds; accordingly, if we give $E\phi$ the same
value as $\phi y$, 14 will be satisfied. 15 is satisfied by insuring that $-E{-}\phi$ has the
same value as $\phi y$. 7 and 8 are satisfied by making the B-systems for $\vee$, $-$,
reproduce the structure of the C-systems.

(13). K = 0, 1, 2, 3, 4   A = 0   B = 1, 2   C = 3, 4   T = 4.

 pr | 0 | E | −        ∨ | 1  2        ∨ | 3  4 | −
  1 | 3 | 3 | 2        1 | 2  2        3 | 4  4 | 4
  2 | 4 | 4 | 1        2 | 2  2        4 | 4  4 | 3

(14). K = 0, 1, 2, 3, 4   A = 0   B = 1, 2   C = 3, 4   T = 4.

 pr | 0 | E | −        ∨ | 1  2        ∨ | 3  4 | −
  1 | 3 | 3 | 2        1 | 1  2        3 | 3  4 | 4
  2 | 4 | 3 | 1        2 | 2  2        4 | 4  4 | 3

14 fails when $\phi = 2$.




(15). K = 0, 1, 2, 3, 4   A = 0   B = 1, 2   C = 3, 4   T = 4.

 pr | 0 | E | −        ∨ | 1  2        ∨ | 3  4 | −
  1 | 3 | 4 | 2        1 | 1  2        3 | 3  4 | 4
  2 | 4 | 4 | 1        2 | 2  2        4 | 4  4 | 3

15 fails when $p = 3$, $\phi = 1$.

(16). K = 0, 1, 2   A = 0   B = 1   C = 2   T = 2.

 pr | 0 | E | −        ∨ | 1  2
  1 | 2 | 2 | 1        1 | 1  .
  2 | . | . | 2        2 | .  2














Without (16) we cannot prove that p is distinct from — p. 


Theorems 


All the theorems of the ordinary algebra of logic can be proved from our 
postulates 5, 6, 8-11. It was proved by P. Bernays that Principia *1.5 is 
redundant; and with the omission of this, our set is equivalent to that of Prin- 
cipia. We shall, accordingly, assume the theorems that can be derived from 
these postulates; and after proving two general theorems not given in Principia, 
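we shall proceed at once to the derivation of theorems concerning $\phi x$ and $E\phi$.
For theorems from the algebra of logic we shall give references to Principia *1-*5.


$p \supset q\,.$ = df. $-p \vee q$                         $\phi \supset \psi\,.$ = df. $-\phi \vee \psi$.
$p\,.\,q\,.$ = df. $-(-p \vee -q)$                         $\phi\,.\,\psi\,.$ = df. $-(-\phi \vee -\psi)$.
$p \equiv q\,.$ = df. $(p \supset q\,.\,q \supset p)$      $\phi \equiv \psi\,.$ = df. $(\phi \supset \psi\,.\,\psi \supset \phi)$.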


Th. 18.1. If $\vdash p$ and $\vdash q$, then $\vdash p\,.\,q$.
*3.2: $p\,.\supset:q\,.\supset.\,p\,.\,q$. Hence by 13 and Hp., Prop.
Th. 18.11 If $\vdash p\,.\,q$, then $\vdash p$ and $\vdash q$.
This follows from *3.26 and *3.27.
Th. 18.2 If $\vdash p \equiv q$, then if $q$ be substituted for $p$, or $p$ for $q$, in any T-element,
the result is a T-element.
By 18.11 if $\vdash p \equiv q$, then $\vdash p \supset q$.
Let $p \vee r$ be any T-element containing $p$.
*2.53: $\vdash p \vee r\,.\supset.\,-p \supset r$. Hence by Hp. and 13, $\vdash -p \supset r$.
Hence by Hp. and 18.1, $\vdash p \supset q\,.\,-p \supset r$.
*3.48: $\vdash p \supset q\,.\,-p \supset r:\supset:p \vee -p\,.\supset.\,q \vee r$.
Hence by Hp., 13, and *2.11: $\vdash q \vee r$. Similarly for $p \vee r$. It has been
shown by Huntington (loc. cit. p. 297) that we cannot prove from $p \equiv q$ that $p$
is equal to $q$; but this more limited theorem is sufficient for our purposes.
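The propositions of Principia cited in the proof of Th. 18.2 are two-valued tautologies, which can be confirmed by a truth-table sweep. The sketch below is only an illustration in ordinary Boolean logic; the function names are hypothetical, and the formulas are the instances of *2.53, *3.48 and *2.11 as they are used in the proof.

```python
from itertools import product


def implies(p, q):
    """Material implication, defined as in the text by (not p) or q."""
    return (not p) or q


def is_tautology(formula, arity):
    """True when the formula holds under every assignment of truth values."""
    return all(formula(*values)
               for values in product([False, True], repeat=arity))


def star_2_53(p, r):
    # p v r . implies . -p implies r
    return implies(p or r, implies(not p, r))


def star_3_48(p, q, r):
    # (p implies q) and (-p implies r) : implies : (p v -p) implies (q v r)
    return implies(implies(p, q) and implies(not p, r),
                   implies(p or (not p), q or r))


def star_2_11(p):
    # p v -p
    return p or (not p)
```

A formula such as `lambda p, q: implies(p, q)` is rejected by `is_tautology`, which shows that the sweep does discriminate.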
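$G\phi\,.$ = df. $-E{-}\phi$
Th. 20 $\vdash \phi y \vee E{-}\phi$.
14: $\vdash -(-\phi)y \vee E{-}\phi$.
By 7: $\vdash -\,-(\phi y) \vee E{-}\phi$. Hence by *4.13 and 18.2, Prop.
Th. 20.1 $\vdash G\phi\,.\supset.\,\phi y$.
20 and *2.53
Th. 20.11 $\vdash E\phi \vee E{-}\phi$
By *3.33, $\vdash G\phi\,.\supset.\,\phi y:\phi y\,.\supset.\,E\phi:\supset:G\phi \supset E\phi$.
Hence by 13, 14, 20.1: $\vdash G\phi \supset E\phi$. Hence by *4.64, Prop.
Th. 20.12 If $\vdash p \vee \phi y$, then $\vdash p \vee G\phi$, provided that $y$ does not appear as a
constituent of $p$.
By 15, putting $-\phi$ for $\phi$, *4.13 and 18.2.
Th. 20.2 $\vdash G\phi\,.\,G\psi\,.\supset.\,\phi y\,.\,\psi y$.
By 18.1 and 20.1: $\vdash G\phi \supset \phi y\,.\,G\psi \supset \psi y$.
Hence by *3.47 ($p \supset r\,.\,q \supset s\,.\supset:p\,.\,q\,.\supset.\,r\,.\,s$), Prop.
Th. 20.21 $\vdash G\phi\,.\,G\psi:\equiv:G(\phi\,.\,\psi)$.
By 7, 8, 20.12, and 20.2, $\vdash G\phi\,.\,G\psi:\supset.\,G(\phi\,.\,\psi)$ --- (1).
$\vdash G(\phi\,.\,\psi)\,.\supset.\,(\phi\,.\,\psi)y$.
By 7, 8 $:\supset.\,\phi y\,.\,\psi y$.
*3.26 $:\supset.\,\phi y$.
20.12 $:\supset.\,G\phi$.
20.12 and *3.27 $:\supset.\,G\psi$.
18.1 and *3.43 $:\supset.\,G\phi\,.\,G\psi$ --- (2). Hence Prop., by (1), (2), and 18.1.
Th. 20.3 $\vdash G(\phi \supset \psi)\,.\,\phi y:\supset.\,\psi y$.
By 20.1 and 8: $\vdash G(\phi \supset \psi)\,.\supset.\,\phi y \supset \psi y$. Hence by *3.31, Prop.
Th. 20.4 $\vdash G(\phi \supset \psi)\,.\,G\phi:\supset.\,G\psi$.
By 20.2: $\vdash G(\phi \supset \psi)\,.\,G\phi:\supset:\phi y \supset \psi y\,.\,\phi y$.
By *3.35 and *3.33, $:\supset:\psi y$. Hence by 20.12, Prop.
Th. 20.41 $\vdash G(\phi \equiv \psi)\,.\supset.\,G\phi \equiv G\psi$.
By 20.4, *3.3, $\vdash G(\phi \supset \psi)\,.\supset.\,G\phi \supset G\psi$.
Hence by 18.1, *3.47, $\vdash G(\phi \supset \psi)\,.\,G(\psi \supset \phi):\supset:G\phi \supset G\psi\,.\,G\psi \supset G\phi$.
Hence by 20.21, 18.2, Prop.
Th. 20.42 $\vdash E\phi \equiv E{-}{-}\phi$.
By 14, $\vdash \phi x \supset E\phi$.
*2.16, $\vdash -E\phi\,.\supset.\,-\phi x$.
20.12 $.\supset.\,-E{-}{-}\phi$.
*2.16 $\vdash E{-}{-}\phi\,.\supset.\,E\phi$ ..... (1).
By 20 $\vdash E{-}{-}\phi\,.\vee.\,-\phi y$.
15 $\vdash E{-}{-}\phi\,.\vee.\,-E\phi$.
*2.16 $\vdash E\phi\,.\supset.\,E{-}{-}\phi$ ..... (2).
Hence by (1), (2), Prop.
Th. 20.5 $\vdash G(\phi \supset \psi)\,.\supset.\,E\phi \supset E\psi$.
$\vdash G(\phi \supset \psi)\,.\supset.\,\phi y \supset \psi y$.
By *4.1 $.\supset.\,-\psi y \supset -\phi y$.
20.12, 20.4, $.\supset.\,G{-}\psi \supset G{-}\phi$.
20.42 $.\supset:-E\psi\,.\supset.\,-E\phi$. Hence, by *4.1, Prop.


Th. 20.6 $\vdash G(\phi \equiv \psi)\,.\supset.\,E\phi \equiv E\psi$.
By 18.1, 20.21, and 20.5.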


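$\phi = \psi\,.$ = df. $G(\phi \equiv \psi)$          $\phi \subset \psi\,.$ = df. $G(\phi \supset \psi)$


Upon these definitions is based the whole calculus of classes; the following
important theorem is the analogue of 18.2, justifying the use of the symbol "=".
Th. 21 If $\vdash \phi = \psi$, then if in any T-element $\psi$ be substituted for $\phi$, or $\phi$ for $\psi$,
the result is a T-element.

$\phi$ may appear in a T-element in either of two ways, as $\phi x$ or as $E\phi$. Let
$p \vee \phi x$ be any T-element containing $\phi$ as $\phi x$. $\vdash G(\phi \equiv \psi)\,.\supset.\,\phi x \equiv \psi x$. Hence
by 18.2, $\vdash p \vee \psi x$.

As the subject of an E-operation, $\phi$ may appear in the following ways:


$E(\phi \vee \chi) \vee p$.
$E{-}(\phi \vee \chi) \vee p$.
$-E(\phi \vee \chi) \vee p$.
$-E{-}(\phi \vee \chi) \vee p$.


(All other cases, e.g. $p \vee E(\phi \vee \chi)$, $E(-\phi \vee \chi) \vee p$, follow readily from these.)
$\vdash G(\phi \equiv \psi):\supset.\,\phi x \equiv \psi x$.
By *4.37 $.\supset.\,\phi x \vee \chi x\,.\equiv.\,\psi x \vee \chi x$.
20.12, 20.41 $.\supset.\,G(\phi \vee \chi)\,.\equiv.\,G(\psi \vee \chi)$ --- (1)
20.6 $.\supset.\,E(\phi \vee \chi)\,.\equiv.\,E(\psi \vee \chi)$ --- (2)
By (1), (2), and *4.11, the four cases are covered.
Hence by 18.2, Prop.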
It is now possible to prove any of the theorems of the calculus of classes. 

In order to extend the calculus to include dyadic relations, we use
Schönfinkel's device, whereby a function of two variables is treated as a function of
one variable, which itself can take a variable as an argument.

Additional base: a class $B_2$ of elements $\phi$; an operation $F$ such that:


If $\phi$ is in $B_2$, and $x$ is in A, then $\phi x$ is in B,
If $\phi$ is in $B_2$ then $G\phi$ is in B,

$\vdash \phi xy = F(\phi y)x$,

If $\phi y$ is in B, then $\vdash FF\phi y = F\phi y$.


A similar method can be used to extend the calculus to any number of
variables:

























Additional base: Classes $B_2, B_3, \dots, B_n, \dots$ of elements; an operation $F$, such
that:


If $\phi$ is a $B_n$, $F\phi$ is a $B_n$, $(n > 1)$,
$F_2(\phi x)$ = df. $F[F(\phi x)]$ etc.,
If $\phi$ is a $B_{n+1}$, then $\vdash F_{n+1}\phi x = F_n\phi x$.


A subscript to $\phi$ may be used to indicate its degree, i.e. the number of operations
required to construct a proposition from it.


$\vdash (F_m\phi_{n+1}x)\,y_1y_2y_3 \cdots y_n = \phi\,y_1y_2 \cdots y_m\,x\,y_{m+1} \cdots y_n$.


Postulates 3, 4, 7, and 8 are also extended to functions involving more than one
individual as a variable.

The significance of these postulates may be made clear by translating them
into a more obvious and familiar symbolism.

$F\phi x$ may be written $\phi(-, x)$; and since $\phi$ here represents some dyadic relation,
$F\phi x$ will mean 'the things that stand in the relation $\phi$ to $x$'.
Similarly,


$F_2(\phi_3 x)$ means $\phi(-_1, -_2, x)$.
$[F_2(\phi_3 x)]y$ means $\phi(y, -_1, x)$.
$F[G\phi_3(y)]$ means $(x)\,.\,\phi(x, -_1, y)$.


A function of $n$ variables has $n$ empty places which require to be filled if we
are to construct a proposition from it. These places must be filled either with
names of individuals (A-elements), or with apparent variables (in our system
E-operations). The F-operation enables us to indicate which places are to be
filled by which elements or generalizations. In $\phi_n x$ the first place, reading from
the left, is filled by $x$; in $F_m\phi_n x$ the $(m + 1)$th place is filled by $x$, unless $m \geq n$,
in which case the $n$th place is filled by $x$.

Our F-function is similar to Schönfinkel's T-function and Curry's C-function;
both these letters have already been given special meanings in our system, and
could not be used here without confusion.

Our system does not include the use of functions as subjects of propositions, 
or as apparent variables. We hope to study this extension in a later paper. 


Stellenbosch, South Africa.


ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


ON THE COVERING OF A COMPLETE SPACE BY THE GEODESICS
THROUGH A POINT¹


By J. H. C. Whitehead
(Received December 12, 1933. Revised November 6, 1934)


1. Introductory. This paper is about the covering of a differential metric
space $F_n$ by the geodesics through a given point $O$. The space is an analytic
manifold² of $n$ dimensions with a metric³ defined by a differential invariant

$ds = f(P, dP).$

In coördinates $x$,

$f(P, dP) = F(x^1, \dots, x^n, dx^1, \dots, dx^n),$

where $F$ is an analytic function of $x$ and $dx$ satisfying further conditions given
in §2. In §3 it is shown that these, together with a completeness condition for
the space as a whole, are sufficient to ensure that each geodesic can be produced
indefinitely in both directions⁴ and that any ordered pair of points are joined
by at least one oriented extremal of minimum length.

Because of the properties described above a normal coördinate system at a
given point $O$ can be extended to define an analytic transformation $y \to P$,
of the entire tangent Minkowski space at $O$, which we call $M_n$, into $F_n$, the
space $M_n$ covering the whole of $F_n$. The points in $M_n$ at which $y \to P$ fails
to be locally (1-1) are those which are conjugate to $O$ with respect to suitable
geodesics. They constitute an analytic complex which we call the conjugate
complex.

1 Revised Oct., 1934. I wish to thank certain Princeton mathematicians, in particular
Prof. M. S. Knebleman, for criticisms of the original script. These related to the original
§2 and have led to the completion of Mayer's existence proof for normal coördinates.
Consequent on a series of conversations with Prof. Marston Morse I have also taken the
opportunity to modify §3.

2 Cf. O. Veblen and J. H. C. Whitehead, 'The Foundations of Differential Geometry,'
Cambridge Tract no. 29 (Cambridge, 1932), chap. VI.

3 Metric geometry of this kind was first studied as a generalization of Riemannian
geometry by P. Finsler (Göttingen Dissertation, 1918). See also L. Berwald, Jahresbericht
der Deutsch Math. Verein., 34 (1925), 213-20, and Atti del Congresso Internazionale dei
Matematici, 6 (Bologna, 1928), 263-70.

4 Our §3 should be compared with a paper by Hopf and Rinow, Commentarii Math.
Helvetici, 3 (1931), 209-25. Though they are dealing only with 2-dimensional Riemannian
geometry their methods can easily be extended to the type of space considered here.

Another important locus in $M_n$ is defined by analogy with a locus introduced
by Poincaré in a paper on surfaces of positive curvature.⁵ Let $c$ be a geodesic
through $O$ which is not the absolute minimum arc joining $O$ to every point on it.
There is a last point $P$ on $c$ such that the arc of $c$ joining $O$ to $P$ is at least
as short as any other curve joining $O$ to $P$. The locus in question is swept
out by $P$ as $c$ varies.

The object of this paper is to prepare the way for a detailed study of these
two loci, and of the transformation $y \to P$ near a point on either of them.

In the two-dimensional case there is a certain amount of literature on the
conjugate locus, most of it referring to geodesics on a surface,⁶ while M. Mason
and G. A. Bliss have considered what we call an ordinary neighbourhood of the
conjugate locus in three dimensions.⁷ Among the more recent writings in this
field we refer, above all, to the work of Marston Morse on the calculus of
variations in the large.⁸


2. Normal coördinates. The length of a vector $dx$ is given in coördinates
$x$ by

$ds = F(x, dx)$

and the square of its length by

$ds^2 = g_{ij}(x, dx)\,dx^i\,dx^j,$

where

$g_{ij} = \tfrac12\,\dfrac{\partial^2 F^2}{\partial dx^i\,\partial dx^j}.$

The function $F$ is to be analytic in $x$ and for all non-zero values of $dx$, positively
homogeneous of the first degree in the latter, and is to satisfy the conditions

(2.1)  $F(x, dx) > 0, \qquad g = |g_{ij}| \neq 0,$

for all points $x$ in the region of definition and all non-zero vectors $dx$. These
conditions imply⁹

$E(x, dx, \delta x) \geq 0$  and  $g_{ij}(x, dx)\,\delta x^i\,\delta x^j > 0$

provided $dx \neq 0$, $\delta x \neq 0$, where $E$ is the Weierstrass function, and $E > 0$ if
$\delta x \neq dx$.

5 Sur les lignes géodésiques des surfaces convexes, Trans. American Math. Soc., 6 (1905),
237-74.

6 See, for example, G. A. Bliss, The geodesic lines on an anchor ring, Annals of Math.,
4 (1903), 1-21.

J. W. Lindberg, Zur Theorie der Maxima und Minima einfacher Integrale mit bestimmten
Integrationsgrenzen, Math. Annalen, 59 (1904), 321-31.

H. Poincaré, loc. cit.

B. F. Kimball, Geodesics on a toroid, American Journal of Math., 52 (1930), 29-52.

7 Trans. American Math. Soc., 9 (1908), p. 453.

8 For an account of his recent work see his article in the 'Verhandlungen des
internationalen Math.-Kongresses Zürich,' 1932, pp. 173-88.

9 J. H. C. Whitehead, Quarterly Journal of Math. (Oxford Series), 4 (1933), 291-6.

From the homogeneity in $dx$ we have the relations

(2.2)  $\dfrac{\partial g_{ij}}{\partial dx^k}\,dx^j = \dfrac{\partial g_{ik}}{\partial dx^j}\,dx^j = 0.$
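The relations (2.2) are an instance of Euler's theorem on homogeneous functions. The following sketch writes $\xi$ for $dx$, a notational assumption made here only for readability, and uses nothing beyond the homogeneity of $F$ and the definition of $g_{ij}$.

```latex
\begin{align*}
F(x,\lambda\xi) = \lambda F(x,\xi)\ \ (\lambda > 0)
  &\;\Longrightarrow\; F^2(x,\lambda\xi) = \lambda^2 F^2(x,\xi), \\
g_{ij}(x,\xi) = \tfrac12\,\frac{\partial^2 F^2}{\partial\xi^i\,\partial\xi^j}
  &\;\Longrightarrow\; g_{ij}(x,\lambda\xi) = g_{ij}(x,\xi), \\
\text{Euler's theorem (degree 0):}\quad
  \frac{\partial g_{ij}}{\partial\xi^k}\,\xi^k &= 0, \\
\frac{\partial g_{ij}}{\partial\xi^k}
  = \tfrac12\,\frac{\partial^3 F^2}{\partial\xi^i\,\partial\xi^j\,\partial\xi^k}
  \ \text{(symmetric in } i, j, k)
  &\;\Longrightarrow\;
  \frac{\partial g_{ij}}{\partial\xi^k}\,\xi^i
  = \frac{\partial g_{ij}}{\partial\xi^k}\,\xi^j = 0 .
\end{align*}
```

The same argument applied to $F^2$ itself, which is homogeneous of degree two, gives $g_{ij}\,\xi^i\xi^j = F^2$.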
The geodesics, or extremals of the integral

$\int F(x, dx),$

are given by differential equations of the form

(2.3)  $\dfrac{d^2x^i}{ds^2} = H^i\Big(x, \dfrac{dx}{ds}\Big),$

where $ds$ is the element of arc and

$H^i(x, \lambda\xi) = \lambda^2 H^i(x, \xi)$  if $\lambda > 0$.

We shall use both terms 'geodesic' and 'extremal,' the latter being used for an
arc of a geodesic joining two specified points. We do not confine ourselves to
the reversible case [i.e. $F(x, dx)$ and $H(x, dx)$ are not necessarily the same as
$F(x, -dx)$ and $H(x, -dx)$]. Therefore an extremal is to mean an oriented
curve.

The functions $H(x, \xi)$ are analytic in $x$ and for all non-zero values of $\xi$. The
derivatives $\partial H^i/\partial\xi^j$ exist and are zero when $\xi = 0$, as may be verified from first
principles. It follows that they are continuous in $x$ and $\xi$ for all values of the
latter.

From an argument given by W. Mayer¹⁰ it follows that the solutions

$x^i = x^i(q, p, s)$   $(s \geq 0)$,

of the equations (2.3) determined by the initial conditions

(2.4)  $x^i(q, p, 0) = q^i, \qquad \dfrac{dx^i}{ds}(q, p, 0) = p^i,$

are of the form

(2.5)  $x^i = f^i(q, y),$

where $y^i = p^i s$. Since the functions $H(x, \xi)$ are analytic when $\xi \neq 0$ the
functions $f(q, y)$ are analytic¹¹ in $q$ and $y$ for non-zero values of the latter. Writing
$p^i = \delta^i_j$ and letting $s \to 0$, it may be verified from first principles (cf. Mayer loc.
cit. p. 95) that the partial derivatives $\partial f^i/\partial y^j$ exist at the origin and

$\Big(\dfrac{\partial f^i}{\partial y^j}\Big)_0 = \delta^i_j.$

10 Duschek-Mayer, Lehrbuch der Differentialgeometrie, vol. II, Leipzig (1930), pp. 93
et seq.

11 L. Bieberbach, Differentialgleichungen, Berlin (1926), pp. 47 et seq., 115 et seq.










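As Mayer points out, the existence of normal coördinates for a complete
neighbourhood of $q$ depends on the assumption that $\partial f^i/\partial y^j$ are continuous throughout a
region of the form

$|y| < \delta$   $(\delta > 0)$.

To prove this, replace (2.3) by the first order equations

$\dfrac{dx^i}{ds} = \xi^i, \qquad \dfrac{d\xi^i}{ds} = H^i(x, \xi),$

with solutions

$x^i = x^i(q, p, s), \qquad \xi^i = \xi^i(q, p, s),$

where $\xi^i(q, p, 0) = p^i$. The continuity of $\partial H^i/\partial\xi^j$ implies the continuity of the
partial derivatives $\partial\xi^i/\partial p^j$, in $p$, $q$ and $s$. Evaluating them when $s = 0$, we find

$\Big(\dfrac{\partial\xi^i}{\partial p^j}\Big)_{s=0} = \delta^i_j.$

When $s > 0$,

$\dfrac{\partial f^i}{\partial y^j} = \dfrac{1}{s}\,\dfrac{\partial}{\partial p^j}\Big[q^i + \int_0^s \xi^i(q, p, t)\,dt\Big] = \dfrac{1}{s}\int_0^s \dfrac{\partial\xi^i}{\partial p^j}\,dt = \Big(\dfrac{\partial\xi^i}{\partial p^j}\Big)_{t=\theta s},$

the differentiation under the integral sign being permissible since $\partial\xi^i/\partial p^j$ is
continuous in all its arguments and therefore uniformly continuous in $t$, relative to $s$.
Therefore

$\dfrac{\partial f^i}{\partial y^j} \to \Big(\dfrac{\partial f^i}{\partial y^j}\Big)_0$  as $y \to 0$.

This argument completes Mayer's existence proof for normal coördinates.

Let

$h_{ij}(y, dy) = \tfrac12\,\dfrac{\partial^2 G^2(y, dy)}{\partial dy^i\,\partial dy^j}$

be the components of the metric tensor in normal coördinates at a point $q$. If

$a_{ij}(y) = h_{ij}(0, y),$

I say that

(2.6)  $h_{ij}(y, y)\,y^j = a_{ij}(y)\,y^j.$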





COVERING BY GEODESICS 683 


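For the components $h_{ij}$ satisfy relations similar to (2.2) and if

$$C^i_{jk}(y, dy) = \tfrac{1}{2}\,h^{im}\left(\frac{\partial h_{jm}}{\partial y^k} + \frac{\partial h_{km}}{\partial y^j} - \frac{\partial h_{jk}}{\partial y^m}\right),$$

the equations for the geodesics may be written as

$$\frac{d^2 y^i}{ds^2} + C^i_{jk}\left(y, \frac{dy}{ds}\right)\frac{dy^j}{ds}\frac{dy^k}{ds} = 0.$$

Since $y^i = p^i s$ are geodesics,

$$C^i_{jk}(y, y)\,y^j y^k = 0,$$

and because of (2.2) the relations (2.6) follow from the same formal steps as those used in Riemannian geometry.¹²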
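We shall have occasion to use a type of coördinate system, analogous to geodesic polar coördinates, which are defined as follows. Let $\theta^1, \dots, \theta^{n-1}$ be parameters for a region on the locus given by

$$A(y) = a_{ij}(y)\,y^i y^j = 1. \tag{2.7}$$

The coördinates in question, $(r, \theta)$, are related to the coördinates $y$ by the equations

$$y^i = p^i(\theta^1, \dots, \theta^{n-1})\,r,$$

where $p^1(\theta), \dots, p^n(\theta)$, which we require to be analytic in $\theta$, satisfy the equation (2.7). From this equation and (2.2) we have

$$a_{ij}\,y^i\,\frac{\partial y^j}{\partial \theta^\lambda} = 0; \tag{2.8}$$

and from (2.6) we have

$$h_{ij}(y, y)\,y^i\,\frac{\partial y^j}{\partial \theta^\lambda} = 0 \tag{2.9}$$

and

$$h_{ij}(y, y)\,\frac{\partial y^i}{\partial r}\frac{\partial y^j}{\partial r} = 1. \tag{2.10}$$

The rank of the matrix

$$\left\|\frac{\partial p^i}{\partial \theta^\lambda}\right\|$$

is assumed to be $n - 1$ and it follows from (2.7) and (2.8) that the vectors

$$y^i,\ \frac{\partial y^i}{\partial \theta^1},\ \dots,\ \frac{\partial y^i}{\partial \theta^{n-1}} \qquad (r > 0)$$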





¹² O. Veblen, Invariants of a Quadratic Differential Form, Cambridge Tract no. 24 (Cambridge, 1933), p. 96.



















are linearly independent. Therefore the Jacobian of the transformation $(r, \theta) \to y$ does not vanish and it has an inverse which is defined throughout a region covered by a pencil of geodesics through the origin.

Writing $\theta^0$ for $r$, the components $\gamma_{ij}$ of the metric tensor in the coördinates $\theta$ satisfy the conditions

$$\gamma_{00}(\theta, d\theta^0, 0, \dots, 0) = 1, \qquad \gamma_{0\lambda}(\theta, d\theta^0, 0, \dots, 0) = 0 \qquad (\lambda = 1, \dots, n-1,\ d\theta^0 > 0), \tag{2.11}$$

as follows from (2.9) and (2.10). Also $\gamma_{\lambda\mu} \to 0$ as $r \to 0$.


3. The complete space. We now consider an analytic manifold $F_n$ with a differential invariant $f(P, dP)$ defined at each point and for each vector $dP$, associated with $P$. In each allowable coördinate system $f(P, dP)$ is to determine a function $F(x, dx)$ which satisfies the conditions of §2.

The length of an oriented curve¹³ with $P_0$ and $P_1$ as its first and last points respectively is defined as the value of the integral

$$\int f(P, dP),$$

taken along the curve from $P_0$ to $P_1$, the curve being positively parameterized. The distance $\delta(P_0, P_1)$ is defined as the greatest lower bound of this integral taken along all such curves. In general $\delta(P, Q) \neq \delta(Q, P)$, but it is obvious that

$$\delta(P, P) = 0, \qquad \delta(P, Q) + \delta(Q, R) \geq \delta(P, R)$$

and that $\delta(P, Q)$ is a continuous function¹⁴ of $P$ and $Q$, continuity being defined in terms of allowable coördinate systems for neighbourhoods of $P$ and $Q$.

In geodesic polar coördinates of the kind described in §2,

$$E(r, \theta, \delta r, 0, dr, d\theta) = F(r, \theta, dr, d\theta) - dr$$

so long as $\delta r > 0$, whether $dr \geq 0$ or $dr < 0$. It follows from the regularity condition (2.1) that

$$F(r, \theta, dr, d\theta) \geq dr,$$

equality occurring only when $d\theta = 0$. Therefore the length of any curve joining two points $(r_0, \theta_0)$ and $(r_1, \theta_1)$ near the origin is at least $|r_1 - r_0|$. Also the oriented extremal $\theta = $ const. joining the origin to any near-by point $P$ is shorter than any other curve joining the origin to $P$, and the length of any





¹³ Any of the curves referred to will consist of a finite number of arcs, each of which has a continuously turning tangent.

¹⁴ Cf. Mayer, loc. cit., pp. 79 et seq.

¹⁵ In future a curve joining P to Q is to mean an oriented curve with P as its first point.







curve joining the origin to a point which does not lie inside the locus given in normal coördinates by

$$A(y) = \delta^2 \tag{3.1}$$

exceeds $\delta/2$. It follows that

$$\delta(P, Q) > 0 \quad \text{if } P \neq Q.$$

Let $\Sigma$ denote the locus given by (3.1) and let $C$ be the closure of the region bounded by $\Sigma$. Since $\delta$ is the distance from the origin to any point on $\Sigma$ the latter will be called the sphere of radius $\delta$ having its centre at the origin. I have shown elsewhere¹⁶ that a positive $\rho$ exists such that $C$ is ovaloid provided $\delta < \rho$. That is to say, any point $P$ in $C$ is joined to any point $Q$ in $C$ by a unique extremal which does not meet $\Sigma$, except possibly in $P$ or $Q$. If $\delta < \rho$ it follows from this property and from a theorem due to Marston Morse and S. B. Littauer¹⁷ that $C$ is completely represented in a single normal coördinate system with $P$ as its origin, $P$ being any point inside $C$. If $\delta' < \delta/2$ it follows that any point $P$ in the region $C'$, bounded by (3.1) with $\delta$ replaced by $\tfrac{1}{2}\delta'$, is joined to any point $Q$ in $C'$ by a unique extremal whose length is $\delta(P, Q)$.

Now assume $F_n$ to be locally compact relative to the metric $\delta(P, Q)$. That is to say any infinite set of points in a region

$$\delta(O, P) < K \qquad (O \text{ fixed, } P \text{ variable}),$$

has at least one limit point. Then the theorem on the absolute minimum can be extended into the large by the classical method¹⁸ due to Hilbert. The latter can be simplified by means of a lemma which is, I believe, well known, but which does not seem to have appeared in print.¹⁹

If $P$ and $Q$ are any two points in $F_n$, there is a point $R$ such that

$$\delta(P, R) = \delta(R, Q) = \tfrac{1}{2}\delta(P, Q). \tag{3.2}$$

Let $c_1, c_2, \dots$ be a sequence of curves joining $P$ to $Q$ such that $l_1, l_2, \dots \to \delta(P, Q)$, where $l_\alpha$ is the length of $c_\alpha$. Let $R_\alpha$ be the mid-point of $c_\alpha$. If $\alpha$ is large enough

$$\delta(P, R_\alpha) < \delta(P, Q).$$





¹⁶ Quarterly Journal of Math. (Oxford Series), loc. cit. This, and the two notes on which it depends (Quarterly Journal, 3 (1932), 33-42 and 4 (1933), 226-7), are worded for the reversible case. It is the result stated here that is actually proved.

¹⁷ Proc. National Academy of Sciences, 18 (1932), 724-30.

¹⁸ Cf. Hopf and Rinow, loc. cit.

¹⁹ It was first pointed out to me by Prof. W. F. Osgood. In the terminology used in the topological theory of metric spaces, the lemma states that $F_n$ is convex relative to the metric $\delta(P, Q)$.







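Therefore the sequence $R_1, R_2, \dots$ has at least one limit point, $R$. For any positive $\epsilon$ and a suitable value of $\alpha$ depending on $\epsilon$, we have

$$\delta(P, R) < \delta(P, R_\alpha) + \epsilon/2 \leq \tfrac{1}{2} l_\alpha + \epsilon/2 < \tfrac{1}{2}\delta(P, Q) + \epsilon,$$
$$\delta(R, Q) < \delta(R_\alpha, Q) + \epsilon/2 \leq \tfrac{1}{2} l_\alpha + \epsilon/2 < \tfrac{1}{2}\delta(P, Q) + \epsilon.$$

Therefore

$$\delta(P, R) \leq \tfrac{1}{2}\delta(P, Q), \qquad \delta(R, Q) \leq \tfrac{1}{2}\delta(P, Q),$$

and since

$$\delta(P, R) + \delta(R, Q) \geq \delta(P, Q),$$

it follows that $R$ satisfies (3.2).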
For a given value of $k$ it follows from the lemma that there is a sequence of points $P_0 = P, P_1, \dots, P_m = Q$ $(m = 2^k)$, such that

$$\delta(P_\alpha, P_{\alpha+1}) = \frac{1}{m}\,\delta(P, Q) \qquad (\alpha = 0, 1, \dots, m - 1).$$
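
The dyadic construction can be tried out numerically in a simple flat model. The following is only an illustrative sketch, not part of the argument: it assumes a plane with a non-reversible norm (Euclidean length plus a linear term), in which straight segments realise the distance and the ordinary midpoint satisfies (3.2). The names `DRIFT`, `F`, `dist` and `dyadic_chain` are hypothetical, and the convention that the distance from p to q is F(q - p) is an assumption made for the example.

```python
import math

# Hypothetical asymmetry vector (an assumption for this sketch);
# its Euclidean norm must be < 1 so that F stays positive and convex.
DRIFT = (0.3, 0.0)

def F(v):
    """Non-reversible norm: Euclidean length plus a linear drift term."""
    return math.hypot(v[0], v[1]) + DRIFT[0] * v[0] + DRIFT[1] * v[1]

def dist(p, q):
    """Oriented distance from p to q, measured along the straight segment."""
    return F((q[0] - p[0], q[1] - p[1]))

def dyadic_chain(p, q, k):
    """Points P_0 = p, ..., P_m = q with m = 2**k, built by repeated midpoints."""
    chain = [p, q]
    for _ in range(k):
        refined = [chain[0]]
        for a, b in zip(chain, chain[1:]):
            mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
            refined += [mid, b]
        chain = refined
    return chain
```

In this model `dist(P, Q)` and `dist(Q, P)` differ in general, yet every consecutive pair in `dyadic_chain(P, Q, k)` is at oriented distance `dist(P, Q) / 2**k`, which is the property used in the text.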


As in the standard proof, it now follows from the corresponding theorem in the small and the generalized Heine-Borel theorem, that any point $P$ is joined to a given point $Q$ by at least one extremal whose length is $\delta(P, Q)$.

It will also be necessary to show that any extremal can be produced indefinitely.²⁰ To state this more precisely we define a complete geodesic as the set of all extremals connected with a given one by a finite sequence of extremals, consecutive members of which have a segment in common. The theorem is:

Each complete geodesic is given by a correspondence

$$P = P(s),$$

defined for all real values of $s$, where $ds$, if positive, is the element of arc.

Let $\Sigma$ be an ovaloid sphere, given in arbitrary coördinates $x$ by

$$\Sigma(x) = \delta,$$

$\Sigma(x)$ being the distance from the centre to $x$. Let $C$ be the closure of the region bounded by $\Sigma$, $q$ any point inside $C$ and $p$ a unit vector at $q$ (i.e. $F(q, p) = 1$). Let $S$ be the $(n - 1)$-sphere consisting of all such vectors. Then a continuous transformation²¹ $\Sigma \to S$ is defined by making a given point $x$ on $\Sigma$ correspond to the initial vector of the extremal $qx$. No two points on $\Sigma$ correspond to the





²⁰ Hopf and Rinow show that this condition is equivalent to local compactness.

²¹ It is well known that an extremal in C varies continuously with its end points. See, for example, J. H. C. Whitehead, loc. cit. (1932).










same vector and it follows from topological considerations that every vector is the image of a point on $\Sigma$. Therefore the geodesic given by

$$x^i = x^i(q, p, s), \tag{3.3}$$

subject to (2.4), meets $\Sigma$ for some positive value of $s$, say $s_2(q, p)$. The curve (3.3) with its orientation reversed is a solution of (2.3) with $H(x, dx)$ replaced by $H(x, -dx)$ and is an extremal of the integral of $F(x, -dx)$. The region $C$ has the same properties of convexity relative to both sets of extremals since they are the same curves. Therefore (3.3) meets $\Sigma$ when $s$ has a negative value $s_1(q, p)$, the functions $s_1$ and $s_2$ being the two roots of

$$\Sigma\{x^1(q, p, s), \dots, x^n(q, p, s)\} = \delta.$$

If $\delta' < \delta$ it follows from our earlier remarks that $-s_1(q, p)$ and $s_2(q, p)$ both exceed $\delta - \delta'$ if $q$ lies inside the sphere $\Sigma'$, of radius $\delta'$ and concentric with $\Sigma$.
Now let a geodesic segment be given by

$$P = P(s), \qquad -\infty < a < s < b < \infty. \tag{3.4}$$

This segment is contained in a finite region of $F_n$ and since the latter is locally compact the sequence of points $P_1, P_2, \dots$, where

$$P_k = P\left(b - \frac{b - a}{2^k}\right),$$

has at least one limit point $P$. With the same notation as in the last paragraph, let $P$ be the centre of the spheres $\Sigma$ and $\Sigma'$. Since $P$ is a limit point of the sequence $P_1, P_2, \dots$ there is a value of $k$ such that $P_k$ is contained in $\Sigma'$ and also

$$b - \frac{b - a}{2^k} + \delta - \delta' > b.$$

Therefore a segment of the complete geodesic determined by (3.4) is defined by (3.3) for values of $s$ exceeding $b$. Similarly it is defined for values of $s$ inferior to $a$. We should thus arrive at a contradiction in supposing there to be a finite upper or lower bound to the values of $s$ for which it is defined.


4. The transformation y → P. Let $O = (q^1, \dots, q^n)$ be any point in $F_n$ and let $M_n$ be the tangent space at $O$. The space $M_n$ has a metric given by

$$\delta(\xi, \eta) = F(q, \xi - \eta),$$

or, in terms of differentials,²² by

$$ds = F(q, d\xi).$$





²² Spaces of this kind were first studied by H. Minkowski (Theorie der konvexen Körper, insbesondere Begründung ihres Oberflächenbegriffs, Gesammelte Abhandlungen, Leipzig, 1911, Vol. II) and are often called Minkowski spaces after him (cf. L. Berwald, Congresso Internazionale, loc. cit.).










The geodesics determined by such a metric are straight lines, the arc being measured by an affine parameter. The infinite linear segments (half straight lines) issuing from the origin of $M_n$, likewise the geodesic segments issuing from $O$, will be called rays in $M_n$ and in $F_n$ respectively.

A normal coördinate system for a neighbourhood of $O$ defines a transformation from points in $M_n$ to points in $F_n$, segments of the rays in $M_n$ corresponding to segments of the rays in $F_n$. The transformation just described can be extended to define a single valued transformation, $y \to P$, of the entire space $M_n$ into the whole of $F_n$. For it can be extended indefinitely along each ray to define a transformation given by

$$P = P(y^1, \dots, y^n) \tag{4.1}$$

with

$$y^i = p^i s, \tag{4.2}$$

of an entire ray in $M_n$ into an entire ray in $F_n$. Letting the parameters $p$ take on all values satisfying the equation

$$A(p) = 1$$

we obtain the desired transformation.

The transformation $y \to P$ is analytic for all values of $y$ with the possible exception of the origin.

If $y_0$ corresponds to $P_0$, this means that the transformation $y \to P$, operating on a neighbourhood of $y_0$, is given by an analytic transformation

$$x^i = f^i(y^1, \dots, y^n),$$

$x$ being a coördinate system for a neighbourhood of $P_0$. From the definition it follows that the points in $M_n$ for which $y \to P$ is analytic constitute an open set. Moreover it is analytic for points near the origin, other than the origin itself. Therefore, assuming the theorem to be false, there will be a point

$$\bar y = (p_0^1 \bar s, \dots, p_0^n \bar s)$$

at which $y \to P$ is not analytic, but such that it is analytic at all points $(p_0^1 s, \dots, p_0^n s)$, where $0 < s < \bar s$. Let $\bar P$ be the image of $\bar y$ in $y \to P$ and let $\bar P$ be the centre of two concentric ovaloid spheres $\Sigma$ and $\Sigma'$, whose radii are $\delta$ and $\delta'$ respectively, with $\delta > \delta'$. Let $(r, \theta)$ be geodesic polar coördinates for a pencil of rays containing $p_0$ and let $(\bar r, \theta_0)$ be the coördinates of the point $\bar y$. Since $y \to P$ is continuous as $y$ varies along any single ray, there is a value of $r$, say $r_0$, between zero and $\bar r$ such that the points $(r, \theta_0)$ in $M_n$ are carried into points inside $\Sigma'$ provided $r_0 \leq r < \bar r$. Near these points $y \to P$ is given by

$$x^i = \phi^i(r, \theta), \tag{4.3}$$

the functions $\phi$ being analytic for

$$|\theta^\lambda - \theta_0^\lambda| < \eta(r), \qquad r_0 \leq r < \bar r,$$










where $\eta(r)$ is some positive function of $r$. Let $r_0$ also satisfy the condition

$$\bar r - r_0 < \delta - \delta'.$$

If

$$q^i(\theta) = \phi^i(r_0, \theta) \quad \text{and} \quad p^i(\theta) = \left(\frac{\partial \phi^i}{\partial r}\right)_{r = r_0}$$

there is a positive $\epsilon$ $(\epsilon < \eta(r_0))$ such that $q(\theta)$ lies in $\Sigma'$ provided

$$|\theta - \theta_0| < \epsilon,$$

since $\phi^1(r, \theta), \dots, \phi^n(r, \theta)$ are analytic at $(r_0, \theta_0)$. Also $p(\theta)$ is a unit vector. Comparing (4.3) with (3.3) we have therefore

$$\phi^i(r, \theta) = x^i\{q(\theta), p(\theta), r - r_0\},$$

and therefore the transformation $y \to P$ is also given by

$$x^i = x^i\{q(\theta), p(\theta), r - r_0\}.$$

Therefore it is analytic throughout the region in $M_n$ given by

$$|\theta - \theta_0| < \epsilon, \qquad s_1\{q(\theta), p(\theta)\} < r - r_0 < s_2\{q(\theta), p(\theta)\},$$

where $s_1$ and $s_2$ mean the same as in §3. But $s_2$ exceeds $\delta - \delta'$, which in turn exceeds $\bar r - r_0$. Therefore this region includes the point $(\bar r, \theta_0)$ and we have a contradiction in supposing $y \to P$ not to be analytic at every point of $M_n$ other than the origin.

Notice that no essential use has so far been made of the fact that $F_n$ is analytic. If $k \geq 3$ everything we have said will apply to a differential metric space of class $k$. The differential invariant $f(P, dP)$ is of class $k - 1$, and the transformation $y \to P$ of class $k - 2$ for $y \neq 0$.


5. The conjugate locus. A point in $M_n$ at which the Jacobian of $y \to P$ vanishes will be described as conjugate to the origin, or simply as a conjugate point. The set of conjugate points will be called the conjugate locus. The remaining points will be described as regular. A conjugate point $y_0$ is carried by $y \to P$ into a point which is conjugate to $O$ with respect to the image of the ray $Oy_0$. From the theory of the J.D.E.,²³ apart from the analyticity of $y \to P$, it follows that the conjugate points on any ray are isolated.

Marston Morse and S. B. Littauer²⁴ have shown that the transformation $y \to P$ fails to be (1-1) in the neighbourhood of a conjugate point.

Let $(r_0, \theta_0)$ be polar coördinates of any conjugate point and let $y \to P$ be





²³ We shall use J.D.E. to stand for 'Jacobi differential equations.'

²⁴ Loc. cit.













given by equations of the form (4.3). According to a theorem by Marston Morse,²⁵ if $n - k$ is the rank of the matrix

$$\left\|\frac{\partial x^i}{\partial \theta^j}\right\| \qquad (j = 0, 1, \dots, n - 1,\ \theta^0 = r),$$

evaluated for $\theta = \theta_0$, the Jacobian

$$\Delta(r, \theta) = \frac{\partial(x^1, \dots, x^n)}{\partial(r, \theta^1, \dots, \theta^{n-1})},$$

evaluated for $\theta^\lambda = \theta_0^\lambda$, has a zero of the $k$th order for $r = r_0$. The number $k$ is called the order of the conjugate point.

Since the conjugate points in the neighbourhood of a given one are those points in $M_n$, and only those, which satisfy the analytic equation

$$\Delta(r, \theta) = 0 \tag{5.1}$$

the conjugate locus is what S. Lefschetz and the present author have called an analytic structure.²⁶ It can therefore be covered by an analytic simplicial complex, in general infinite, which we shall call the conjugate complex.

Near any conjugate point $(r_0, \theta_0)$ the locus of conjugate points is given by an equation of the form²⁷

$$r^k + B_1(\theta)\,r^{k-1} + \dots + B_k(\theta) = \{r - \omega_1(\theta)\} \cdots \{r - \omega_k(\theta)\} = 0, \tag{5.2}$$

where the coefficients $B(\theta^1, \dots, \theta^{n-1})$ are analytic near $\theta_0$ and

$$\omega_1(\theta_0) = \dots = \omega_k(\theta_0) = r_0.$$

The coefficients $B$ are real for real values of $\theta$ due to the fact that those of the roots $\omega(\theta)$ which are not real for real values of $\theta$ occur in complex conjugate pairs. We shall use a powerful method due to Marston Morse (Morse I) to show that the functions $\omega_1(\theta), \dots, \omega_k(\theta)$ are all real for real values of $\theta$. Among other things, this means that the conjugate locus is everywhere $(n - 1)$-dimensional and no ray touches it.

Let $Oy$ be any ray in $M_n$ and let $S_1, S_2, \dots, S_m$ be concentric spheres in $M_n$ with their centres at the origin. Let $r_1, \dots, r_m$ $(r_\alpha < r_{\alpha+1})$ be their radii. Let $(t_\alpha^1, \dots, t_\alpha^{n-1})$ be analytic parameters for a neighbourhood of $S_\alpha$, which we call $\sigma_\alpha$, containing the point where $Oy$ meets it. If the point $t_\alpha$ is carried by $y \to P$ into $P_\alpha$, we assume $r_{\alpha+1} - r_\alpha$ to be small enough, and $\sigma_\alpha$ to be so restricted, that $P_\alpha$ is joined to $P_{\alpha+1}$ by an elementary extremal, $t_\alpha$ and $t_{\alpha+1}$ being





²⁵ Trans. American Math. Soc., 31 (1929), p. 385 (this paper will be referred to as 'Morse I'), and Proc. National Academy of Sciences, 17 (1931), 319-20. See also §9 below.

²⁶ Trans. American Math. Soc., 35 (1933), 510-17. See also S. Lefschetz, Topology, New York (1930), chap. VIII, and B. O. Koopman and A. B. Brown, Trans. American Math. Soc., 34 (1932), 231-51.

²⁷ W. F. Osgood, Lehrbuch der Funktionentheorie, Berlin (1929), vol. II, pp. 83 et seq.







arbitrary points in $\sigma_\alpha$ and $\sigma_{\alpha+1}$. Then $\delta(P_\alpha, P_{\alpha+1})$ is an analytic function of $t_\alpha$ and $t_{\alpha+1}$, and the length of the broken extremal

$$OP_1P_2 \cdots P_m$$

is an analytic function,

$$J(u^1, \dots, u^N), \qquad N = m(n - 1),$$

of

$$(u^1, \dots, u^N) = (t_1^1, \dots, t_1^{n-1}, \dots, t_m^1, \dots, t_m^{n-1}).$$

Let

$$Q_{\lambda\mu}(u) = \frac{\partial^2 J}{\partial u^\lambda\,\partial u^\mu}$$

and let $v^1, \dots, v^N$ be the values of $u^1, \dots, u^N$ determined by a ray which meets $\sigma_m$ in a point, $y_m$, which is arbitrary except that $Oy_m$ shall meet $\sigma_1, \dots, \sigma_{m-1}$. Further let $\sigma_m$ contain no conjugate point. Then, according to Morse I, the quadratic form

$$Q_{\lambda\mu}(v)\,\xi^\lambda \xi^\mu \tag{5.3}$$

is non-degenerate and its type number²⁸ is the sum of the orders of the conjugate points on the segment $Oy_m$. Since (5.3) is non-degenerate, and since the coefficients are continuous in $v$, the type number does not vary with small variations of $y_m$.
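
For readers who want to experiment, the type number of a small symmetric matrix (the number of negative terms in a normal form of the quadratic form) can be counted mechanically. The sketch below is not Morse's method and does not build the form (5.3). It assumes only that the matrix is symmetric and that no zero pivot is met, in which case the number of negative pivots equals the number of negative squares by Sylvester's law of inertia. The function name `type_number` and the sample matrices are hypothetical.

```python
def type_number(matrix, tol=1e-12):
    """Count negative pivots of a symmetric matrix by Gaussian elimination.

    Assumption: every pivot met during elimination is non-zero, so the
    pivots are the diagonal of an LDL^T factorisation and their signs
    give the signature of the quadratic form.
    """
    a = [[float(entry) for entry in row] for row in matrix]
    n = len(a)
    negatives = 0
    for i in range(n):
        pivot = a[i][i]
        if abs(pivot) <= tol:
            raise ValueError("zero pivot: this simple sketch does not apply")
        if pivot < 0:
            negatives += 1
        # Eliminate column i below the pivot; the trailing block becomes
        # the Schur complement, which is again symmetric.
        for j in range(i + 1, n):
            factor = a[j][i] / pivot
            for c in range(i, n):
                a[j][c] -= factor * a[i][c]
    return negatives
```

For example, `type_number([[1, 2], [2, 1]])` counts one negative pivot, matching the single negative square in $x^2 + 4xy + y^2 = (x + 2y)^2 - 3y^2$. Small perturbations of a non-degenerate matrix leave the count unchanged, which is the stability property invoked in the text.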

Now let $(r_0, \theta_0)$ be polar coördinates of any conjugate point on the segment $Oy_m$, and let the conjugate locus be given locally by (5.2). In the neighbourhood of $(r_0, \theta_0)$ the variable ray $Oy_m$ meets the conjugate locus in at most $k$ points, a conjugate point of order $p$ being counted $p$ times. Therefore a deficiency near one of the conjugate points on $Oy_m$ cannot be compensated for by an excess over the corresponding $k$ near another. Since $\sum k$ (summed for all the conjugate points on the segment $Oy_m$) is constant, it follows that $Oy_m$ meets the conjugate locus in at least $k$ points near $(r_0, \theta_0)$. Therefore $\omega_1(\theta), \dots, \omega_k(\theta)$ are all real near $\theta_0$. For if $\omega_\lambda(\theta_1)$ was not real for some value of $\lambda$ the ray $\theta_1$ would meet the conjugate locus in less than $k$ points near $(r_0, \theta_0)$.


6. By means of the transformation $y \to P$ a quadratic tensor²⁹ is defined which has a unique set of components at each point of $M_n$. For let $y$ be any point in $M_n$, $P$ its image in $y \to P$, and let $x$ be a coördinate system for some





²⁸ The type number is the number of negative terms in a normal form.

²⁹ This tensor has a different transformation law from the ordinary tensor-function of position.










neighbourhood of $P$. Let $\xi$ be a tangent vector at $x$ to the ray corresponding to $Oy$. Then the components of the tensor referred to are

$$h_{pq}(y) = g_{ij}(x, \xi)\,\frac{\partial x^i}{\partial y^p}\frac{\partial x^j}{\partial y^q}. \tag{6.1}$$

The components $h$ are thus defined for each value of $y$ and are analytic everywhere except at the origin where they are, in general, many valued.

With the notation of §2

$$h_{pq}(y) = h_{pq}(y, y)$$

for values of $y$ near the origin. Therefore

$$h_{pq}\,y^q = a_{pq}\,y^q \tag{6.2}$$

near the origin and hence for all values of $y$. From (6.1) we have

$$h_{pq}\,dy^p\,dy^q \geq 0. \tag{6.3}$$

From (6.1) it also follows that the matrices

$$\|h_{pq}\| \quad \text{and} \quad \left\|\frac{\partial x^i}{\partial y^p}\right\|$$

have the same rank. Therefore the rank of the former is $n - k$ at a conjugate point of order $k$. Also

$$h = |h_{pq}| = g\left|\frac{\partial x}{\partial y}\right|^2, \tag{6.4}$$

and therefore $\sqrt{h}$ is analytic for all non-zero values of $y$. Moreover the conjugate locus is given as a whole by the single equation

$$\sqrt{h} = 0. \tag{6.5}$$

Let $\gamma_{ij}$ $(i, j = 0, 1, \dots, n - 1,\ \theta^0 = r)$ be the components of this tensor in polar coördinates. As in §2 we have

$$\text{(a)}\ \ \gamma_{00} = 1, \qquad \text{(b)}\ \ \gamma_{0\lambda} = 0. \tag{6.6}$$

From equations analogous to (6.4) and from a theorem quoted in §5 it follows that $\sqrt{\gamma}$ has a zero of the $k$th order in $r$ at a conjugate point of order $k$. If $y \to P$ is given by

$$x^i = \phi^i(r, \theta) \tag{6.7}$$

the equations (6.6b) express the fact that the vectors $\partial\phi^i/\partial\theta^\lambda$ in $F_n$ are transversal to $\partial\phi^i/\partial r$. We deduce the useful property that










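$$\frac{\partial \phi^i}{\partial r}\,dr + \frac{\partial \phi^i}{\partial \theta^\lambda}\,d\theta^\lambda = 0 \tag{6.8}$$

implies

$$dr = 0, \qquad \frac{\partial \phi^i}{\partial \theta^\lambda}\,d\theta^\lambda = 0. \tag{6.9}$$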
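
As a concrete check of these relations, one can take the simplest example, which is not drawn from the paper: the unit 2-sphere with its ordinary Riemannian metric, with $O$ the north pole. There the transformation $y \to P$ in polar coördinates is assumed to be $(r, \theta) \mapsto (\sin r\cos\theta, \sin r\sin\theta, \cos r)$ in an embedding in Euclidean 3-space. The sketch below computes the components $\gamma_{ij}$ by finite differences; the names `exp_map`, `pullback_metric` and `sqrt_gamma` are hypothetical.

```python
import math

def exp_map(r, theta):
    """Image on the unit sphere of the point (r, theta) of the tangent plane
    at the north pole (assumed model of y -> P for this illustration)."""
    return (math.sin(r) * math.cos(theta),
            math.sin(r) * math.sin(theta),
            math.cos(r))

def pullback_metric(r, theta, h=1e-5):
    """Components gamma_ij (index 0 = r, index 1 = theta), by central differences."""
    def partial(index):
        if index == 0:
            plus, minus = exp_map(r + h, theta), exp_map(r - h, theta)
        else:
            plus, minus = exp_map(r, theta + h), exp_map(r, theta - h)
        return [(a - b) / (2 * h) for a, b in zip(plus, minus)]
    d = [partial(0), partial(1)]
    return [[sum(u * v for u, v in zip(d[i], d[j])) for j in range(2)]
            for i in range(2)]

def sqrt_gamma(r, theta):
    """Square root of det(gamma_ij); clamped at zero against rounding error."""
    g = pullback_metric(r, theta)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return math.sqrt(max(det, 0.0))
```

In this model $\gamma_{00} = 1$ and $\gamma_{01} = 0$ up to discretisation error, in agreement with (6.6), while $\sqrt{\gamma}$ behaves like $|\sin r|$ and so vanishes to first order at $r = \pi$, the first conjugate point on every ray.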


This and the last section give us a number of necessary conditions for an incomplete space F4, satisfying the conditions of §2, to be equivalent to part of a complete space.³⁰ One condition is that the functions

$$h_{pq}(y, y),$$

defined as in §2, must be capable of analytic continuation into functions $h_{pq}$, which are analytic for all real non-zero values³¹ of $y$. These functions and the locus (6.5) must satisfy the conditions given in this and in the preceding section. From the results of the following sections it will be obvious how to obtain other necessary conditions.

In the Riemannian case at least, the problem of obtaining sufficient conditions may be formulated as follows. Let a Riemannian space be defined by the metric

$$dt^2 = h_{pq}\,dy^p\,dy^q, \tag{6.10}$$

where the functions $h_{pq}$ are analytic for all values of $y$, and $h \not\equiv 0$. This space may be singular in the sense that $h$ may vanish at certain points. Under what conditions can one construct a non-singular Riemannian space $F_n$ which is related to (6.10) in the way described in this and the preceding sections?


7. The solutions of the J.D.E. Using polar coördinates for $M_n$, let the transformation $y \to P$ be given by

$$x^i = \phi^i(r, \theta) \tag{7.1}$$

near any point $(r_0, \theta_0)$. For a given value of $\theta$ the vector in $F_n$ whose components are

$$\frac{\partial \phi^1}{\partial \theta^\lambda}\,u^\lambda(\theta),\ \dots,\ \frac{\partial \phi^n}{\partial \theta^\lambda}\,u^\lambda(\theta) \tag{7.2}$$

is a solution of the J.D.E., $u^1, \dots, u^{n-1}$ being independent of $r$. A solution of the J.D.E. defined over any segment of a ray in $F_n$ determines a solution along the whole ray. Therefore there is a sense in saying that a solution defined over any segment vanishes at $O$. In this sense (7.2) is the general solution which vanishes at $O$. This is the image in $F_n$ of the vector $(0, u^1, \dots, u^{n-1})$ associated with points of the ray $\theta$ in $M_n$. Thus any vector field in $M_n$ whose


³⁰ Cf. Hopf and Rinow, loc. cit., §1.

³¹ This is similar to a condition found by Rinow, Dissertation, Berlin, 1932, §3.










$r$-component is zero and whose other components do not depend on $r$, is carried by $y \to P$ into a family of vectors consisting of solutions of the J.D.E. set up for a variable ray.

At a conjugate point the vector

$$\frac{\partial \phi^1}{\partial \theta^\lambda}\,u^\lambda,\ \dots,\ \frac{\partial \phi^n}{\partial \theta^\lambda}\,u^\lambda$$

vanishes for at least one non-zero set of values $u^1, \dots, u^{n-1}$, which are the last $n - 1$ components of a vector $(0, u^1, \dots, u^{n-1})$ in $M_n$. Using the notation of §6 it follows that the vectors in $M_n$ which are carried by $y \to P$ into zero vectors are also given by

$$u^0 = 0, \qquad \gamma_{\lambda\mu}\,u^\mu = 0$$

or in the coördinates $y$, by

$$h_{pq}\,\eta^q = 0.$$


8. The infinitesimal geometry of the conjugate locus. A point on the conjugate locus will be described as ordinary if its neighbourhood is an $(n - 1)$-cell, and as a branch point otherwise. Thus ordinary points are those in whose neighbourhood the conjugate locus is given by (5.2) with $k = 1$ or $\omega_1 = \dots = \omega_k$ if $k > 1$. In either case the neighbourhood is given by a single analytic equation of the form

$$r = \omega(\theta). \tag{8.1}$$

Moreover the ordinary points near a branch point lie on $(n - 1)$-cells each of which is given by such an equation. It follows that in a suitable subdivision of the conjugate complex each $(n - 1)$-cell is given by an equation of the form (8.1) as $\theta$ varies over an $(n - 1)$-cell on the unit sphere. The orders of any two points in such an $(n - 1)$-cell are obviously the same, and this will be called the order of the cell.

Now let $\sigma_{n-1}$ be an $(n - 1)$-cell of order $k$, and first suppose $k > 1$. In a neighbourhood of $\sigma_{n-1}$ let $y \to P$ be given by (7.1), and let

$$\phi^i_\lambda(\theta) = \left[\frac{\partial \phi^i(r, \theta)}{\partial \theta^\lambda}\right]_{r = \omega(\theta)},$$

$\sigma_{n-1}$ being given by (8.1). Since (6.8) implies (6.9) it follows that the matrix

$$\|\phi^i_\lambda\|$$

is of rank $n - k - 1$, and the vectors associated with points of $\sigma_{n-1}$ which are carried into null vectors are given by the Pfaffian system³² of equations

$$\phi^i_\lambda\,d\theta^\lambda = 0. \tag{8.2}$$





³² These equations are equivalent to $\gamma_{\lambda\mu}\,d\theta^\mu = 0$, and in the presence of the tensor $h_{pq}$ can therefore be set up without reference to the space $F_n$ (cf. the concluding remarks of §6).







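If

$$\psi^i(\theta) = \phi^i\{\omega(\theta), \theta\}$$

we have

$$d\psi^i = \phi^i_r\,d\omega + \phi^i_\lambda\,d\theta^\lambda \qquad (\phi^i_r = \partial\phi^i/\partial r,\ r = \omega(\theta))$$

and it follows that the equations

$$\phi^i_\lambda\,d\theta^\lambda = 0, \qquad d\omega = 0 \tag{8.3}$$

are completely integrable.

If $k > 1$ and if the equations (8.2) are completely integrable they are equivalent to (8.3).

For if (8.2), which may be written

$$d\psi^i - \phi^i_r\,d\omega = 0,$$

are completely integrable, the equations

$$d\phi^i_r\,\delta\omega - \delta\phi^i_r\,d\omega = 0 \tag{8.4}$$

are satisfied for all values of $d\theta$ and $\delta\theta$ which satisfy³³ (8.2). But since $k > 1$ there is a non-zero solution, $d\theta$, such that $d\omega = 0$, while if (8.2) are not equivalent to (8.3) there is another solution, $\delta\theta$, such that $\delta\omega \neq 0$. For these values of $d\theta$ and $\delta\theta$ it follows from (8.4) that

$$\frac{\partial^2 \phi^i}{\partial r\,\partial \theta^\lambda}\,d\theta^\lambda = 0 \tag{8.5}$$

when $r = \omega(\theta)$. But the vector

$$\frac{\partial \phi^i}{\partial \theta^\lambda}\,d\theta^\lambda \qquad (\theta \text{ fixed, } r \text{ variable})$$

satisfies the J.D.E. and vanishes for $r = \omega(\theta)$, but not identically. Hence (8.5) is absurd and the theorem follows.

If $k = 1$ the equations (8.2) are linearly dependent upon $n - 2$ of them and these, being independent, are completely integrable like any other system of $(n - 2)$ independent Pfaffian equations in $(n - 1)$ variables.

Let

$$\theta^\lambda = \theta^\lambda(t) \tag{8.6}$$

be an integral curve of (8.2), with $\dot\omega = d\omega/dt \geq 0$. Its image in $y \to P$ is given by

$$x^i = \phi^i[\omega\{\theta(t)\}, \theta(t)]. \tag{8.7}$$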


³³ E. Goursat, Leçons sur le Problème de Pfaff (Paris, 1922), §§65, 66.













The components of the tangent vector of this curve are

$$\phi^i_r\,\frac{d\omega}{dt} + \phi^i_\lambda\,\frac{d\theta^\lambda}{dt} = \phi^i_r\,\dot\omega.$$

If $\dot\omega \neq 0$ it follows that the curve (8.7) envelopes the 1-parameter family of rays whose initial directions are given by (8.6). The element of length of the curve (8.7) $(dt > 0)$ is given by

$$ds^2 = g_{ij}(\phi, \phi_r)\,\phi^i_r\,\phi^j_r\,d\omega^2 = d\omega^2.$$

Hence

$$s(t_0, t) = \omega(t) - \omega(t_0) \qquad (t_0 < t), \tag{8.8}$$

where $s(t_0, t)$ is the length of the segment $(t_0, t)$ of the curve (8.7).³⁴ If $\dot\omega = 0$, that is to say if the curve (8.6) is an integral of (8.3), the curve (8.7) degenerates into a point. Therefore an $(n - 1)$-cell for which the equations (8.2) and (8.3) are equivalent will be called degenerate.

If $\sigma_{n-1}$ is degenerate each integral manifold of (8.2) lies in the intersection of $\sigma_{n-1}$ with a sphere $r = $ const. Each of these manifolds,

$$r = \text{const.}, \qquad \theta^\lambda = \theta^\lambda(u^1, \dots, u^k), \tag{8.9}$$

is carried by $y \to P$ into a single point which is a meeting point of the rays in $F_n$ whose initial directions are given by (8.9). The cell $\sigma_{n-1}$ is carried into an $(n - k - 1)$-dimensional subspace of $F_n$, each point of which is the meeting point of $\infty^k$ rays. In each of these families the extremals joining $O$ to the meeting point have the same length, namely the constant value of $r$ in (8.9).

Conversely, the members of any continuous family of extremals joining two points have the same length.³⁵ For one of the common end points may be taken as our point $O$, and the other is the image of an analytic locus in $M_n$ which is given locally by equations of the form

$$\phi^i(r, \theta) = x_0^i.$$

The extremals of the family correspond to linear segments joining the origin of $M_n$ to points on this locus. The latter is composed of analytic cells, a tangent vector to any one of which satisfies the equations

$$\phi^i_r\,dr + \phi^i_\lambda\,d\theta^\lambda = 0.$$

Therefore $dr = 0$ according to (6.8) and (6.9).

In the non-degenerate case $(k \geq 1)$ we can take as a complete set of alge-




³⁴ Cf. Mason and Bliss, loc. cit., p. 453.

³⁵ Marston Morse arrives at this result from a different set of ideas (Zurich, loc. cit., p. 178).










braic solutions of (8.2) a complete set of solutions of (8.3) and a vector $u(\theta)$ which satisfies (8.2) but not (8.3). Through each point of $\sigma_{n-1}$ there is an integral manifold of (8.3), $E_{k-1}$, which is carried by $y \to P$ into a single point $P_0$. The vectors $u(\theta)$ associated with the various points of $E_{k-1}$ are carried into $\infty^{k-1}$ vectors associated with $P_0$. The components of the latter are

$$\phi^i_r\,\omega_\lambda\,u^\lambda$$

and so they are tangent to the rays through $P_0$. Except at points where $\omega_\lambda = 0$ (see §9 below) at most two of these vectors have a given direction. Otherwise three distinct geodesics would have a common tangent direction at $P_0$. Therefore we have, in the non-degenerate case:

Near $P_0$ the conjugate locus in $F_n$ is an $(n - k)$-dimensional envelope of the rays. Each point is the vertex of a $k$-dimensional³⁶ cone of rays, the tangents to which generate a $k$-dimensional cone in the $(n - k)$-dimensional tangent space of the conjugate locus.

Notice that $k \leq n - k$. That is to say the order of a non-degenerate $(n - 1)$-cell on the conjugate locus cannot exceed $n/2$.


9. Approximate equations of y → P. We shall need the lemma:

Let $G(x) = g_{ij}x^i x^j$ be a positive definite quadratic form and let $a$ and $b$ be matrices which satisfy the conditions

$$g_{ij}\,a^i_k\,b^j_l = g_{ij}\,b^i_k\,a^j_l,$$

or in matrix notation³⁷

$$a^* g b = b^* g a. \tag{9.1}$$

If the determinant of the matrix $a\lambda + b\mu$ is not identically zero, $\lambda$ and $\mu$ being complex variables, its elementary divisors are simple.

There is no loss of generality in assuming the matrix $b$ to be non-singular. For if its determinant vanishes we may replace $a\lambda + b\mu$ by $a'\lambda' + b'\mu'$, where $\lambda'$ and $\mu'$ are linearly related to $\lambda$ and $\mu$ and $b'$ is non-singular. If one of these pencils of matrices has simple elementary divisors the same will be true of the other.

So, assuming $b$ to be non-singular, there are matrices $p$ and $q$ such that $pbq$ is the unit matrix, which we denote by 1. If $\alpha = paq$ and $\gamma = p^{*-1} g\,p^{-1}$ the relation (9.1) implies $\alpha^*\gamma = \gamma\alpha$. Now apply a transformation of the form $\alpha_0 = h^{-1}\alpha h$ where $h$ is a matrix such that $h^*\gamma h = 1$, that is to say $h$ is the matrix of a collineation which reduces our quadratic form to the sum of squares. Since $\alpha^*\gamma = \gamma\alpha$ it follows that $\alpha_0^* = \alpha_0$, or that $\alpha_0$ is symmetric. The lemma now follows from a theorem on pairs of quadratic forms,³⁸ applied to the matrices $\alpha_0$ and 1.





³⁶ That is, a cone with a $(k - 1)$-dimensional section.

³⁷ If $u$ is any matrix $u^*$ is the transposed matrix. Remember that $(uv)^* = v^*u^*$ and $(u^{-1})^* = (u^*)^{-1}$ when the latter exists.

³⁸ M. Bôcher, Introduction to higher algebra (New York, 1927), p. 305.
















An argument used by Marston Morse³⁹ shows that the condition $\det(a\lambda + b\mu) \equiv 0$ together with (9.1) implies the existence of a non-zero vector $\xi$ such that

$$a^i_j\,\xi^j = b^i_j\,\xi^j = 0. \tag{9.2}$$

Now let $(r_0, 0)$ be geodesic polar coördinates of any point in $M_n$ and let $P_0$ be its image in $y \to P$. Let $(x, y^1, \dots, y^{n-1})$ be coördinates for a neighbourhood of $P_0$ having $P_0$ as the origin and a segment of the ray $\theta = 0$ as $x$-axis. Also let $x$ measure the distance along this ray, the positive direction being that in which the length of the extremal $OP$ increases. Then the neighbouring rays are given by equations of the form

$$y^\alpha = (a^\alpha_\lambda + b^\alpha_\lambda\,x)\,\theta^\lambda + \cdots. \tag{9.3}$$

From the theory of the J.D.E. it follows that⁴⁰

$$R_{\alpha\beta}\,(a^\alpha_\lambda b^\beta_\mu - a^\alpha_\mu b^\beta_\lambda) = 0 \qquad (\alpha, \beta = 1, \dots, n - 1),$$

where $R_{\alpha\beta}u^\alpha u^\beta$ is a certain positive definite quadratic form. Also there is no non-zero vector $u$ such that

$$a^\alpha_\lambda\,u^\lambda = b^\alpha_\lambda\,u^\lambda = 0.$$

For such a vector would imply the existence of a non-zero solution of the J.D.E. which vanished with its derivative when $x = 0$. Therefore the matrix pencil $a + bx$ is non-singular and has simple elementary divisors, according to our lemma. Therefore the equation (9.3) can be reduced to the form

$$y^\lambda = (p_\lambda + q_\lambda\,x)\,\theta^\lambda + \cdots \qquad (\text{not summed}) \tag{9.4}$$

by linear transformations on $\theta$ and $y$, and $p_\lambda$ and $q_\lambda$ do not both vanish for any value of $\lambda$ (cf. Morse I, pp. 385-6).

Let

$$r = \omega(\alpha^1, \dots, \alpha^{n-2}, \beta)$$

be the equation of a non-degenerate $(n - 1)$-cell on the conjugate locus in $M_n$, where the coördinates $\theta$ have been transformed to coördinates $(\alpha, \beta)$, in which the curves $\alpha^\rho = $ const. represent a given set of solutions of (8.2) but not of (8.3), and $\alpha^\rho = \beta = 0$ represents the same ray as $\theta = 0$. The coördinates $y$ may then be linearly transformed to coördinates $\bar y^1, \dots, \bar y^{n-2}, z$ in which the equations (9.4) have the form

$$\bar y^\rho = (p_\rho + q_\rho\,x)\,\alpha^\rho + \cdots, \qquad z = \beta x + \cdots. \tag{9.5}$$





³⁹ Proc. Nat. Academy of Sciences, 17 (1931), 319-20.
⁴⁰ Marston Morse, Math. Annalen, 103 (1930), 52-69.








COVERING BY GEODESICS 


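If

$$x = x(r, \alpha, \beta), \qquad \bar y^\lambda = \bar y^\lambda(r, \alpha, \beta), \qquad z = z(r, \alpha, \beta)$$

are the equations of $y \to P$ let

$$\psi(\alpha, \beta) = x\{\omega(\alpha, \beta), \alpha, \beta\}.$$

We assume that ω(0, β) is not independent of β, so that

$$\psi(0, \beta) = k\beta^m + \cdots,$$

with m > 0, k ≠ 0. The envelope of the 1-parameter family of rays, $\alpha^\lambda = 0$, is
given by (9.5) with $\alpha^\lambda = 0$ and ψ(0, β) substituted for x.

When $\alpha^\lambda = 0$ let

$$z = \beta x + c\beta^l + \cdots,$$

omitting terms involving $\beta^u x^v$ with v > 1, or u > 1 if v = 1, or u > l if v = 0.
From the identity

$$\left(\frac{\partial z}{\partial \beta}\right)_{x = \psi(0, \beta)} = 0$$

we have

$$k\beta^m + cl\beta^{l-1} + \cdots + A\beta^{u-1+mv} + \cdots = 0,$$

with u − 1 + mv exceeding either l − 1 or m. It follows that

$$l = m + 1, \qquad c = -\frac{k}{m+1},$$

and when ψ(0, β) is substituted for x we have

$$z = \frac{mk}{m+1}\,\beta^{m+1} + \beta^{m+2}B(\beta).$$

Similarly

$$\bar y^\lambda = \beta^{m+2}B^\lambda(\beta)$$

and the equation of the envelope⁴¹ is

$$x = k\beta^m + \beta^{m+1}A(\beta), \qquad \bar y^\lambda = \beta^{m+2}B^\lambda(\beta), \qquad z = \frac{mk}{m+1}\,\beta^{m+1} + \beta^{m+2}B(\beta), \tag{9.6}$$

A(β), $B^\lambda(\beta)$ and B(β) being analytic when β = 0.

⁴¹ Cf. the article by Lindberg referred to in §1 of this paper.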
















When m is even the envelope has a cusp at the origin of coördinates and the
tangent is the x-axis. The cusp points in the positive direction of the
x-axis if k < 0, and in the negative direction if k > 0. Since $\bar y^1, \dots, \bar y^{n-2}$ are
of order m + 2 in β the situation is approximately two-dimensional and the
relation between the rays and their envelope near a cusp is analogous to that
described by Lindberg in the article referred to above.

The two kinds of cusp occur when, and only when, β = 0 is a maximum (k < 0)
or a minimum (k > 0) of the curve on the conjugate locus in $M_n$ given by

$$r = \omega(0, \beta), \qquad \alpha^\lambda = 0.$$

This will be the case if

$$\omega(0, \beta) = k\beta^m + \cdots,$$

omitting terms of higher order than m. Let

$$\omega(0, \beta) = \kappa\beta^\mu + \cdots,$$

and let

$$x(r, 0, \beta) = r + a\beta^\lambda + \cdots,$$

the remaining terms involving $\beta^p r^q$ with p > λ if q = 0. From the identity

$$\left(\frac{\partial x}{\partial \beta}\right)_{r = \omega(0, \beta)} = 0$$

we have

$$a\lambda\beta^{\lambda-1} + \cdots + Ap\,\beta^{p-1+\mu q} + \cdots = 0$$

with p − 1 > λ − 1 if q = 0, and p − 1 ≥ 0 in any case. It follows that
λ > μ if a ≠ 0, and therefore

$$k\beta^m + \cdots = \psi(0, \beta) = \omega(0, \beta) + a\beta^\lambda + \cdots = \kappa\beta^\mu + \cdots.$$

Hence μ = m, κ = k and the theorem follows.


10. The cut locus. If $(p^1, \dots, p^n)$ is a variable point on the sphere

$$A(y) = 1$$

in $M_n$, the rays in $F_n$ may be represented by an equation

$$P = P(r, p), \tag{10.1}$$

r being the length of the segment OP. If

$$r > \delta\{O, P(r, p)\}$$

for a particular value of p and $r = r_0$, this inequality obviously holds for all
values of r exceeding $r_0$. Moreover

$$r = \delta\{O, P(r, p)\} \tag{10.2}$$


for all (positive) values of r less than a positive constant. For a particular
value of p the set of values of r for which (10.2) holds is closed, since the
distance function is continuous. Therefore (10.2) either holds for all values of r
or there is a greatest value of r, which we denote by ρ(p), for which it is true.

Either the point {ρ(p), p} in $F_n$ is conjugate to O or else O is joined to it by
two extremals each having the same length.⁴² For let $r_1, r_2, \dots$ be a decreasing
sequence of numbers converging to ρ(p). There is at least one extremal, $c_\alpha$,
joining O to $P(r_\alpha, p)$ whose length is less than $r_\alpha$. The sequence of extremals
$c_1, c_2, \dots$ either converges to the extremal given by (10.1) or it has some other
limiting extremal $\bar c$. In the first case the point {ρ(p), p} is conjugate to O.
In the second case the length of $\bar c$ does not exceed ρ(p), since the length of $c_\alpha$
is less than $r_\alpha$. It is not less than ρ(p) by the definition of ρ(p), and it
follows that ρ(p) is the length of $\bar c$.

It will be convenient to adopt the convention that ρ(p) has the value 'infinity'
when (10.2) holds for all values of r. With this convention the function ρ(p)
is defined for all values of p.

On a ray p there is no conjugate point, (r, p), with r < ρ(p). For no
extremal can furnish a minimum if it contains a point which is conjugate to one
of its end points.

The function ρ(p) is continuous. That is to say, if $p_1, p_2, \dots \to p$, then $\rho(p_1),
\rho(p_2), \dots \to \rho(p)$, whether ρ(p) is finite or infinite.

If this is false there will be a point $\bar p$ on the unit sphere in $M_n$ and a sequence
$p_1, p_2, \dots$ converging to $\bar p$, such that the sequence $\rho_1, \rho_2, \dots$ ($\rho_\alpha = \rho(p_\alpha)$) has
a limit k (finite or infinite) other than $\rho(\bar p)$. Replacing it by a subsequence if
necessary, we may suppose that $\rho_1, \rho_2, \dots$ converges to k. Since

$$\rho_\alpha = \delta\{O, P(\rho_\alpha, p_\alpha)\}$$

and since, by the continuity of $y \to P$,

$$P(\rho_1, p_1),\ P(\rho_2, p_2),\ \dots \to P(k, \bar p),$$

it follows from the continuity of the distance function that

$$k = \delta\{O, P(k, \bar p)\}.$$

Therefore $k \le \rho(\bar p)$ and, since k differs from $\rho(\bar p)$ by hypothesis, $k < \rho(\bar p)$.
Therefore the point $(\bar p^1 k, \dots, \bar p^n k)$ in $M_n$ is not a conjugate point.
Therefore only a finite number of $(\rho_\alpha, p_\alpha)$ are conjugate points,
and for large values of α it follows that $(\rho_\alpha, p_\alpha)$ has the same image in $y \to P$
as some other point $(\rho_\alpha, q_\alpha)$. Moreover the sequence $(\rho_1, q_1), (\rho_2, q_2), \dots$ does
not have $(k, \bar p)$ as a limit point since $y \to P$ is (1-1) near $(k, \bar p)$, the latter
being a regular point. Therefore it has a limit point $(k, \bar q)$ other than $(k, \bar p)$
and these two points have the same image in $y \to P$, since the latter is
continuous. There are therefore two extremals in $F_n$, namely $\bar p$ and $\bar q$, each of
which minimizes the distance from O to $P(k, \bar p) = P(k, \bar q)$. Then the curve
given by

$$P = P(s, \bar q), \quad 0 \le s \le k; \qquad P = P(s, \bar p), \quad k \le s \le \rho(\bar p)$$

minimizes the distance from O to $P(\rho(\bar p), \bar p)$. But this is absurd, since the
curve in question has a corner, and it follows that ρ(p) is continuous.

⁴² Both possibilities may occur at once.

The function ρ(p) is finite for each value of p if, and only if, $F_n$ is compact.
For if $F_n$ is compact there is a finite upper bound to the distance of a variable
point from O. On the contrary, if $F_n$ is not compact there is a sequence of
points whose distance from O tends to infinity. Such a sequence will
determine a sequence of rays $p_1, p_2, \dots$ such that $\rho(p_1), \rho(p_2), \dots \to \infty$. If p is any
limit point of this sequence, it follows from the continuity of ρ(p) that ρ(p) is
infinite.
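The behaviour of ρ(p) can be illustrated on a simple compact example that is not taken from the paper: the square flat torus obtained from the plane by identifying points that differ by integer multiples of a side length. The sketch below assumes the standard fact that the cut locus of the origin on this torus is the boundary of the fundamental square centred at the origin; the function names `cut_distance_flat_torus` and `torus_distance` are hypothetical helpers introduced for illustration only.

```python
import math

def cut_distance_flat_torus(theta: float, side: float = 1.0) -> float:
    """rho for the ray of direction theta from O on the square flat torus.

    Assumption (not from the paper): the ray stops minimizing where it
    leaves the square [-side/2, side/2]^2 centred at O.
    """
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    half = side / 2.0
    candidates = []
    if c > 1e-15:
        candidates.append(half / c)  # ray meets a vertical edge of the square
    if s > 1e-15:
        candidates.append(half / s)  # ray meets a horizontal edge of the square
    return min(candidates)

def torus_distance(x: float, y: float, side: float = 1.0) -> float:
    """Distance from O to the point (x, y) measured on the flat torus."""
    dx = abs(x) % side
    dx = min(dx, side - dx)
    dy = abs(y) % side
    dy = min(dy, side - dy)
    return math.hypot(dx, dy)

if __name__ == "__main__":
    for theta in (0.0, 0.3, math.pi / 4, 1.1, 2.5):
        rho = cut_distance_flat_torus(theta)
        r_in, r_out = 0.9 * rho, 1.1 * rho
        d_in = torus_distance(r_in * math.cos(theta), r_in * math.sin(theta))
        d_out = torus_distance(r_out * math.cos(theta), r_out * math.sin(theta))
        # Before rho the ray realizes the distance, as in (10.2); beyond it does not.
        print(theta, rho, abs(d_in - r_in) < 1e-12, d_out < r_out)
```

In this example ρ is finite and continuous in the direction, in agreement with the statements above for a compact space.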

Since ρ(p) is continuous, the set of points in $M_n$ whose coördinates (r, p)
satisfy the relation

$$0 \le r < \rho(p)$$

is a region which we denote by $\mathfrak{E}$. No point, $(r_1, p_1)$, in $\mathfrak{E}$ has the same image
in $y \to P$ as another point, $(r_2, p_2)$, in $\mathfrak{E}$ or on $F(\mathfrak{E})$, the boundary of $\mathfrak{E}$. For if
there were two such points we should have $r_1 < \rho(p_1)$, $r_2 \le \rho(p_2)$, and it would
follow from the definition of ρ(p) that $r_1 = r_2$. There would then be two
minimizing extremals in $F_n$ joining O to $P(r_1, p_1)$, which we have shown not
to be the case.

Every point in $\mathfrak{E}$ is regular and therefore the image of $\mathfrak{E}$ in $y \to P$ is a region,
$\mathfrak{F}$, which is in (1-1) correspondence with $\mathfrak{E}$. From the continuity of $y \to P$
and the last paragraph it follows that the closure and boundary of $\mathfrak{E}$ are carried
by $y \to P$ into the closure and boundary of $\mathfrak{F}$ respectively. Since O is joined
to each point in $F_n$ by at least one minimizing extremal it follows that the
closure of $\mathfrak{F}$ is the entire space $F_n$. The region $\mathfrak{E}$, and therefore $\mathfrak{F}$, is obviously
an n-cell, and $\mathfrak{F}$ is represented as a whole in the normal coördinates. So we
have the theorem:

The space $F_n$ is the closure of an n-cell⁴³ which is represented as a whole in a
single normal coördinate system.

The boundary of $\mathfrak{F}$ we shall call a cut-locus, since it is of less than n
dimensions and its residual space in $F_n$ is an n-cell.

If $\mathfrak{E} = M_n$ the boundary of $\mathfrak{E}$ does not exist. Otherwise the intersection of
a given ray with $F(\mathfrak{E})$ is the first point which is either a conjugate point or
which satisfies the conditions

$$\text{(a)}\ f^i(y) = g^i(z), \qquad \text{(b)}\ A(y) = A(z), \tag{10.3}$$





⁴³ Cf. O. Veblen, Analysis Situs (New York, 1931), pp. 155 et seq.




y and z being distinct and $y \to P$ being given by

$$\text{(a)}\ x^i = f^i(y), \qquad \text{(b)}\ x^i = g^i(z) \tag{10.4}$$

near points $y_0$ and $z_0$ respectively. From (8.8) and the final theorem in §9 it
follows that the only non-degenerate conjugate points which can lie on $F(\mathfrak{E})$
are points at which a curve corresponding to an envelope of rays has a minimum.
The corresponding point in $F_n$ is a cusp of the envelope.⁴⁴ Any (n − 1)-cell
common to $F(\mathfrak{E})$ and the conjugate locus is necessarily degenerate. Therefore
the intersection of $F(\mathfrak{E})$ with the conjugate locus is carried by $y \to P$ into a locus
of less than (n − 1) dimensions. The 'general' point on $F(\mathfrak{E})$ is a regular
point $z_0$, related by (10.3) to just one other regular point $y_0$. In the
neighbourhood of such a point (10.3a) can be solved for z in terms of y and for y in terms
of z, and the respective neighbourhoods of $y_0$ and $z_0$ are analytic (n − 1)-cells
given by (10.3b). These (n − 1)-cells are carried by $y \to P$ into (n − 1)-cells
on the boundary of $\mathfrak{F}$. Regular points which are related by (10.3) to more
than one other point, or to a conjugate point, lie on a locus of less than (n − 1)
dimensions.

Conversely, any (n − 1)-cell on the boundary of $\mathfrak{F}$ has two sides, and it is
easy to see that each 'side' is homeomorphic to an (n − 1)-cell on $F(\mathfrak{E})$.

If n = 2 and the metric of $F_n$ is Riemannian, any 1-cell of the cut-locus in $F_n$
bisects the angle between the two rays meeting at any point on it. This
theorem can be generalized as follows. Let $y_0$ and $z_0$ be any two regular points
in $M_n$ which satisfy (10.3). The transformation $y \to P$ is given by (10.4a)
near $y_0$ and by (10.4b) near $z_0$, and the tangents to the rays through x have
components

$$\eta^i = \frac{\partial x^i}{\partial y^p}\,y^p \qquad \text{and} \qquad \zeta^i = \frac{\partial x^i}{\partial z^p}\,z^p$$

respectively. The locus in $F_n$ determined by (10.3) is given locally by

$$\Phi(x) = A\{y(x)\} - A\{z(x)\} = 0.$$

If $g_{ij}(x, dx)$ are the components of the metric tensor in the coördinates x, the
theorem in question states that

$$g_{ij}(x, \eta)\,\eta^j - g_{ij}(x, \zeta)\,\zeta^j = \frac{1}{2}\,\frac{\partial \Phi}{\partial x^i}. \tag{10.5}$$

To prove this we refer to §§2 and 6 for the formulae

$$\text{(a)}\ A(y) = A_{pq}(y)\,y^p y^q, \qquad \text{(b)}\ h_{pq}(y) = g_{ij}(x, \eta)\,\frac{\partial x^i}{\partial y^p}\frac{\partial x^j}{\partial y^q}, \qquad \text{(c)}\ h_{pq}(y)\,y^q = A_{pq}(y)\,y^q. \tag{10.6}$$

⁴⁴ Cf. Poincaré, loc. cit., p. 244.
⁴⁵ Poincaré, loc. cit.





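From (10.6b) and (10.6c) we have

$$g_{ij}(x, \eta)\,\eta^j = g_{ij}(x, \eta)\,\frac{\partial x^j}{\partial y^q}\,y^q = h_{pq}(y)\,\frac{\partial y^p}{\partial x^i}\,y^q = A_{pq}(y)\,\frac{\partial y^p}{\partial x^i}\,y^q.$$

Similarly

$$g_{ij}(x, \zeta)\,\zeta^j = A_{pq}(z)\,\frac{\partial z^p}{\partial x^i}\,z^q.$$

Remembering (2.2), we have

$$\frac{1}{2}\,\frac{\partial \Phi}{\partial x^i} = A_{pq}(y)\,\frac{\partial y^p}{\partial x^i}\,y^q - A_{pq}(z)\,\frac{\partial z^p}{\partial x^i}\,z^q = g_{ij}(x, \eta)\,\eta^j - g_{ij}(x, \zeta)\,\zeta^j,$$

which is the required relation.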


Added in proof (June 15, 1935): Marston Morse's Colloquium lectures (The
calculus of variations in the large, New York, 1934) have recently been
published, also two papers by S. B. Myers (Duke Math. Journal, 1, 1935, 39-49,
and Proc. Nat. Academy of Sciences, 21, 1935, 225-7), which are closely
related to this one.


BALLIOL COLLEGE, OXFORD.








ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


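SUR LA DÉFINITION AXIOMATIQUE D'UNE CLASSE D'ESPACES
VECTORIELS DISTANCIÉS APPLICABLES VECTORIELLEMENT
SUR L'ESPACE DE HILBERT

[On the axiomatic definition of a class of metric vector spaces
vectorially applicable onto Hilbert space]


By Maurice Fréchet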


(Received August 18, 1934) 


Introduction. Von Neumann has defined axiomatically an important class of
metric spaces (espaces distanciés)¹ applicable onto Hilbert space. For this
purpose he introduces a generalization of the scalar product of two elements
and deduces the definition of distance from this product.

Two observations are in order on this subject.

I. First, the scalar product ((f, g)) as defined by von Neumann is in general a
complex number and is a non-symmetric function of f and g. This leads to
difficulties when one wishes to extend to complex Hilbert space certain simple
geometric notions, such as orthogonality and the angle between two directions.

These difficulties are avoided by regarding von Neumann's non-symmetric scalar
product ((f, g)) as a secondary notion, easily deduced from a symmetric scalar
product (f, g) which plays the fundamental role. The latter, symmetric and
always real, even when f, g are, for example, complex functions, lends itself
much better to the generalization of Euclidean geometric properties.

II. But it is possible to go further. The notion of scalar product would extend
with little profit to certain spaces for which a "distance" is defined in a
natural and simple way. Instead of defining the distance incidentally from the
scalar product, as von Neumann does, it is evidently preferable, if possible,
to follow the analogy with Euclidean space more closely by deducing the
definition of the scalar product from that of the distance, taken as the
fundamental notion. Now this is possible, as will be explained below.

In this way there will be brought fully to light the definite, but previously
less evident, dependence of the conceptions underlying von Neumann's
definition: I. on those which led us as early as 1905 to the notion of metric
spaces, and II. on their very interesting application by Banach and Wiener to
the notion of metric vector space. In the present memoir we have transformed
von Neumann's definitions gradually, so as to show directly their equivalence
with the final definition based on the notion of distance.

In another memoir² we shall show directly how this final definition expresses
that the space considered is "vectorially applicable" onto the "concrete"
Hilbert space.

¹ For the definition of this term see: Les Espaces Abstraits, by Maurice Fréchet, 1928,
Gauthier-Villars, Paris, p. 61. Other references to this work will be
marked E. A. Flamant has reduced the system of axioms defining
a metric vector space to a system of independent axioms.


Recall of the definition of a von Neumann space. Since the definition due
to von Neumann has its origin not only in the definition of the concrete
Hilbert space, but also in that of abstract metric spaces and in that of the
abstract vector spaces of Banach and Wiener, it seems to us more appropriate
to designate the space conceived by von Neumann not by the name of only one
of the three notions merged in it, but simply by the name of the one who
brought them together.

Let us recall here the axiomatic definition³ of the von Neumann space.

We consider a class of abstract elements f, g, ..., on which a definition of
convergence and of the limit of an arbitrary sequence of elements
$f_1, f_2, \dots, f_n, \dots$ has been given in advance.

The abstract space so defined will be called a von Neumann space if it
satisfies the following five postulates. We suppose first that two operations
can be associated with it, represented by the symbols + and · and subject to
the following conditions:

Postulate I: The elements of the space form a field of vectors:

(A) 1. If f, g are two elements of the space, so is f + g.
    2. f + g is identical to g + f, that is, f + g = g + f.
    3. [f + g] + h = f + [g + h].

(B) 4. If c is a complex number, c·f belongs to the space.
    5. $c\cdot(c_1\cdot f) = cc_1\cdot f$ for all complex numbers $c, c_1$.
    6. $(c + c_1)\cdot f = c\cdot f + c_1\cdot f$.
    7. $c\cdot[f + g] = c\cdot f + c\cdot g$.
    8. 1·f = f.

(C) 9. There exists an element 0 of the space such that

       f + 0 = f;  c·0 = 0;  0·f = 0.


We now suppose that an operation (( , )), called the (non-symmetric) scalar
product, can be associated with the space, enjoying the following properties:

Postulate II: ((f, g)) is a complex number such that:

α) $((cf, g)) = c\,((f, g))$;
β) $(([f + g], h)) = ((f, h)) + ((g, h))$ for every element h of the space;
γ) $((g, f)) = ((f, g))^{*}$, where in general $c^{*}$ denotes the complex
conjugate of a complex number c;
δ) ((f, f)) is a real number ≥ 0;
ε) ((f, f)) = 0 if f = 0 and only in that case.

² Revista Mat. Hispano-Amer., t. IX, 1934, 193-201.
³ Not having the original text of the definition at hand at present, we shall
content ourselves with giving here a substantially equivalent definition.

Definitions. The (non-negative) square root of ((f, f)) is often called the norm
of f, and one often writes $\|f\|^2 = ((f, f))$. One may put f − g = f + (−1)·g,
and one may put $fg = \|f - g\|$. One proves, with von Neumann, that

P) $fg = gf \ge 0$,

Q) $fg = 0$ if f = g and only in that case,

R) $fg \le fh + hg$.

Postulate III. The necessary and sufficient condition for $f_n$ to converge
to f as n → ∞ is that $(([f - f_n], [f - f_n])) \to 0$.

S) It is equivalent to say that $ff_n$ must tend to zero. Then, by virtue
of the four conditions P), Q), R), S), the abstract space considered is a
metric space, in which fg is the distance between f and g. By virtue of
Postulate I, it is a metric vector space (E. A., p. 140).

Postulate IV. The space considered has an infinite number of dimensions. We
shall be content here to attach the following meaning to this postulate: for no
value of the integer n does there exist a basis of order n of the space, that
is to say a number n of elements $f_1, \dots, f_n$ of the space such that every
element f of the space can be represented in the form

$$f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n,$$

where $c_1, \dots, c_n$ are suitable real or complex numbers.

Postulate V. The space considered is separable (E. A., p. 189) and complete
(E. A., p. 74).


Introduction of the symmetric scalar product. It can be introduced naturally
from the notion of distance, as we shall do below. But in order to take von
Neumann's definition as the starting point at first, we shall define it by the
symbol (f, g) and the formula

$$(f, g) = \tfrac{1}{2}((f, g)) + \tfrac{1}{2}((g, f)). \tag{10}$$


Properties of the symmetric scalar product. It is clear that the new scalar
product (f, g) is symmetric. Consequently, property γ) becomes

γ') $(g, f) = (f, g)$.

On the other hand, by γ),

$$(f, g) = \tfrac{1}{2}((f, g)) + \tfrac{1}{2}((f, g))^{*}, \tag{11}$$

hence the symmetric scalar product (f, g) is a real number.

From the definition of (f, g) we have

$$(f, f) = ((f, f)).$$

Conditions δ) and ε) therefore become

δ') (f, f) is a real number ≥ 0,
ε') (f, f) = 0 if f = 0 and only in that case.

Let us pass to β). By (11) we have

$$([f + g], h) = \tfrac{1}{2}(([f + g], h)) + \tfrac{1}{2}(([f + g], h))^{*}.$$

Consequently, condition β) becomes

β') $([f + g], h) = (f, h) + (g, h)$.

Finally let us apply α); we have

$$(cf, g) = \tfrac{1}{2}((cf, g)) + \tfrac{1}{2}((cf, g))^{*} = \tfrac{1}{2}c\,((f, g)) + \tfrac{1}{2}c^{*}((f, g))^{*}.$$

In particular, we have the consequence α₁) of α):

α₁) $(rf, g) = r\,(f, g)$ for r real.

From α) and γ) one also deduces $((if, ig)) = i\,((f, ig)) = i(-i)((f, g)) = ((f, g))$. Whence
$(if, ig) = \tfrac{1}{2}((if, ig)) + \tfrac{1}{2}((ig, if)) = \tfrac{1}{2}((f, g)) + \tfrac{1}{2}((g, f)) = (f, g)$, and consequently

α₂) $(if, ig) = (f, g)$.


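Finally, observe that if the values of ((f, g)) are not known, but only those
of (f, g), the former can be recovered. Indeed, we have

$$(f, ig) = \tfrac{1}{2}((f, ig)) + \tfrac{1}{2}((f, ig))^{*},$$

whence, by α) and γ),

$$(f, ig) = -\tfrac{i}{2}((f, g)) + \tfrac{i}{2}((f, g))^{*}, \quad\text{or}\quad i\,(f, ig) = \tfrac{1}{2}((f, g)) - \tfrac{1}{2}((f, g))^{*};$$

together with $(f, g) = \tfrac{1}{2}((f, g)) + \tfrac{1}{2}((f, g))^{*}$, this gives

$$((f, g)) = (f, g) + i\,(f, ig). \tag{12}$$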
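Formulas (10) and (12) can be checked numerically on a concrete model. The sketch below is only an illustration under an assumption not made in the text: the space is taken to be finite-dimensional complex vectors with $((f, g)) = \sum_k f_k g_k^{*}$, which satisfies α) to ε). The function names `asym`, `sym` and `recovered` are hypothetical.

```python
def asym(f, g):
    """Non-symmetric product ((f, g)) = sum of f_k * conj(g_k) (assumed model)."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))

def sym(f, g):
    """Symmetric product (f, g) of formula (10); it is real by (11)."""
    value = 0.5 * asym(f, g) + 0.5 * asym(g, f)
    return value.real

def recovered(f, g):
    """Rebuild ((f, g)) from the symmetric product alone, as in formula (12)."""
    ig = [1j * b for b in g]
    return sym(f, g) + 1j * sym(f, ig)

if __name__ == "__main__":
    f = [1 + 2j, 3 - 1j]
    g = [2 - 1j, 1j]
    print(asym(f, g))       # the original non-symmetric product
    print(recovered(f, g))  # should agree with the line above up to rounding
```

The two printed values should coincide up to floating-point rounding, which is the content of (12) in this model.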


Equivalent definitions. Observe further that a definition of the von Neumann
space equivalent to his own is obtained by replacing Postulate II by the
following Postulate II':

Postulate II': (f, g) is a real number satisfying the conditions α₁), α₂),
β'), γ'), δ'), ε') above, and one puts

$$((f, g)) = (f, g) + i\,(f, ig).$$

We have just proved that Postulate II' is a consequence of Postulate II
(when Postulate I is satisfied). Conversely, Postulate II is a consequence
of Postulate II' (when Postulate I is satisfied).










Indeed, from α₂) we obtain

$$(f, ig) = (if, i\cdot ig) = (if, -g),$$

a quantity equal to −(if, g) by α₁). Hence, using γ'),

$$((g, f)) = (g, f) + i\,(g, if) = (f, g) - i\,(f, ig) = ((f, g))^{*}.$$

Condition γ) is therefore satisfied.

Let us now start from β'). We have

$$(([f + g], h)) = ([f + g], h) + i\,([f + g], ih),$$

a quantity which by β') is equal to

$$(f, h) + (g, h) + i\,(f, ih) + i\,(g, ih) = ((f, h)) + ((g, h)),$$

which establishes formula β).

Finally put c = A + iB. We shall have

$$((cf, g)) = ((Af + iBf, g)) = ((Af, g)) + ((iBf, g)).$$

By α₁) and α₂) we have

$$((Af, g)) = (Af, g) + i\,(Af, ig) = A\,(f, g) + Ai\,(f, ig) = A\,((f, g)),$$

$$((iBf, g)) = (iBf, g) + i\,(iBf, ig) = (-Bf, ig) + i\,(Bf, g) = -B\,(f, ig) + Bi\,(f, g) = iB\,[(f, g) + i\,(f, ig)] = iB\,((f, g)).$$

Hence $((cf, g)) = A\,((f, g)) + iB\,((f, g)) = c\,((f, g))$ and condition α) is indeed
satisfied.

Finally, by (12), we have

$$((f, f)) = (f, f) + i\,(f, if).$$

Now, by virtue of α₁) and α₂), it was proved above that (f, ig) = −(g, if). Hence

$$(f, if) = -(f, if), \quad\text{whence}\quad (f, if) = 0 \quad\text{and}\quad ((f, f)) = (f, f).$$

Conditions ε') and δ') therefore imply ε) and δ).

To sum up, when Postulate I is satisfied, Postulates II and II' are
equivalent when one adjoins to II the definition (10) of (f, g), or to II' the
definition (12) of ((f, g)).

It seems to us preferable to use the second system; for in a great many
questions it is the symmetric scalar product (f, g) that will play the
predominant role, and the non-symmetric scalar product ((f, g)) may even not
appear at all.













The two scalar products, symmetric and non-symmetric, deduced from the
notion of norm. In the von Neumann space, the distance $fg = \|f - g\|$
is expressed, when the symmetric scalar product is known, in the form

$$fg = \sqrt{([f - g], [f - g])}. \tag{13}$$

Observe that, conversely, the symmetric scalar product can be expressed when
the distance is known. (The same will consequently hold, by (12), for the
non-symmetric scalar product.)

Indeed, by virtue of β') and γ'), we have

$$\|f + g\|^2 = \|f\|^2 + \|g\|^2 + 2\,(f, g).$$

Whence the expression of (f, g) by means of the norm:

$$(f, g) = \tfrac{1}{2}\left[\|f + g\|^2 - \|f\|^2 - \|g\|^2\right]. \tag{14}$$

A change of sign of g gives, by virtue of α₁), the new expression

$$(f, g) = \tfrac{1}{2}\left[\|f\|^2 + \|g\|^2 - \|f - g\|^2\right]. \tag{15}$$

It must therefore be possible to translate into terms of distances the
conditions imposed on the scalar products, and conversely every condition
imposed on the distance must be translatable into terms of scalar products.
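As an illustration of (14), (15) and (12), the sketch below computes both scalar products from the norm alone and compares them with a direct computation. It assumes, as an example that is not part of the text, the model of finite-dimensional complex vectors with $((f, g)) = \sum_k f_k g_k^{*}$; all function names are hypothetical.

```python
def inner(f, g):
    """Assumed model for ((f, g)): sum of f_k * conj(g_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))

def norm_sq(f):
    """Squared norm ||f||^2 = ((f, f))."""
    return inner(f, f).real

def add(f, g):
    return [a + b for a, b in zip(f, g)]

def sub(f, g):
    return [a - b for a, b in zip(f, g)]

def sym_from_sum(f, g):
    """Formula (14): (f, g) from ||f + g||, ||f||, ||g||."""
    return 0.5 * (norm_sq(add(f, g)) - norm_sq(f) - norm_sq(g))

def sym_from_difference(f, g):
    """Formula (15): (f, g) from ||f||, ||g||, ||f - g||."""
    return 0.5 * (norm_sq(f) + norm_sq(g) - norm_sq(sub(f, g)))

def asym_from_norm(f, g):
    """Formula (12) applied to (15): ((f, g)) recovered from norms only."""
    ig = [1j * b for b in g]
    return sym_from_difference(f, g) + 1j * sym_from_difference(f, ig)

if __name__ == "__main__":
    f = [1 + 2j, 3 - 1j, 0.5j]
    g = [2 - 1j, 1j, -4.0]
    print(sym_from_sum(f, g), sym_from_difference(f, g), inner(f, g).real)
    print(asym_from_norm(f, g), inner(f, g))
```

In this model the three numbers on the first line agree, and the two on the second line agree, up to rounding.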

Let us first translate Postulate II' (admitting Postulate I). Since every
norm $\|f\|$ is at the same time a distance 0f, we can carry out the
translation in terms of the norm.


Properties of the norm in a von Neumann space. Condition β')
becomes, by virtue of (14),

$$0 = 2([f + g], h) - 2(f, h) - 2(g, h) = \|f + g + h\|^2 - \|f + g\|^2 - \|h\|^2 - \|f + h\|^2 + \|f\|^2 + \|h\|^2 - \|g + h\|^2 + \|g\|^2 + \|h\|^2,$$

or:

β₁'') $\|f + g + h\|^2 - \|f + g\|^2 - \|g + h\|^2 - \|h + f\|^2 + \|f\|^2 + \|g\|^2 + \|h\|^2 = 0.$

From α) and γ) we obtain

$$((cf, c_1 g)) = c\,c_1^{*}\,((f, g)), \tag{16}$$

whence the condition

$$\|cf\|^2 = |c|^2\,\|f\|^2 \quad \text{(for c real or complex).} \tag{17}$$







By α₂),

α'') $(if, if) = (f, f)$, whence $\|if\| = \|f\|$;

this last equality is a particular case of (17). By α₁) and γ'), if r, s
are real,

$$(sf, rg) = rs\,(f, g).$$

Let us translate this last equality by means of (14); taking (17) into
account, we shall have

$$\|sf + rg\|^2 - s^2\|f\|^2 - r^2\|g\|^2 = rs\left\{\|f + g\|^2 - \|f\|^2 - \|g\|^2\right\},$$

or:

α₁'') $\|sf + rg\|^2 = s^2\|f\|^2 + r^2\|g\|^2 + rs\left\{\|f + g\|^2 - \|f\|^2 - \|g\|^2\right\}.$

One can even combine α₁'') and β₁'') into a single condition. Indeed, let
x, y, z be three arbitrary real numbers. By virtue of β₁'') we shall have

$$\|xf + yg + zh\|^2 = \|xf + yg\|^2 + \|yg + zh\|^2 + \|zh + xf\|^2 - \|xf\|^2 - \|yg\|^2 - \|zh\|^2.$$

By virtue of α₁'') and (17), this equality becomes:

β'') $\|xf + yg + zh\|^2 = x^2\|f\|^2 + y^2\|g\|^2 + z^2\|h\|^2 + yz\left\{\|g + h\|^2 - \|g\|^2 - \|h\|^2\right\} + zx\left\{\|h + f\|^2 - \|h\|^2 - \|f\|^2\right\} + xy\left\{\|f + g\|^2 - \|f\|^2 - \|g\|^2\right\}$

for x, y, z real.

If β'') is satisfied, one obtains β₁'') by putting x = y = z = 1 in it and
α₁'') by putting z = 0. One obtains (17) for c real by putting y = z = 0 in β'').
We shall see below a simple geometric interpretation of this condition β'').
Conditions γ'), δ'), ε') translate naturally into the respective conditions

γ'') $\|-f\| = \|f\|$, but this equality is also a particular case of β'') for
x = −1, y = z = 0;

δ'') $\|f\|$ = a real number ≥ 0;

ε'') $\|f\| = 0$ if f = 0 and only in that case.

(Conditions δ''), ε'') and (17) are admitted in advance in metric vector
spaces.)













Equivalent postulates. Conversely, suppose that with every element f
of a class of abstract elements satisfying Postulate I there is associated a
number $\|f\|$ satisfying Postulate II'', composed of the conditions α''), β''),
δ''), ε''). Let us study the quantity

$$(f, g) = \tfrac{1}{2}\left[\|f + g\|^2 - \|f\|^2 - \|g\|^2\right]. \tag{14}$$

By virtue of (14), condition γ') will be satisfied, and (f, g) will be a real
number by virtue of δ''). Moreover, by (14) and β''), the latter for x = 2, y = z = 0:

$$(f, f) = \tfrac{1}{2}\left\{\|2f\|^2 - 2\|f\|^2\right\} = \|f\|^2.$$

The equality so obtained allows α₂), δ') and ε') to be deduced from α''), δ'') and ε'').
Finally, α₁) is deduced from β'') by deducing α₁'') from β''), then taking
(14) into account in α₁''), for r = 1.

Finally, when Postulate I is admitted, one may adjoin to it indifferently
either, with von Neumann, Postulate II concerning the non-symmetric scalar
product, with the definition of the norm by the equality

$$\|f\|^2 = ((f, f))$$

and the definition

$$(f, g) = \tfrac{1}{2}((f, g)) + \tfrac{1}{2}((g, f))$$

of the symmetric scalar product; or, as has just been established, Postulate II''
concerning the norm, with the definitions

$$(f, g) = \tfrac{1}{2}\left[\|f\|^2 + \|g\|^2 - \|f - g\|^2\right] \tag{15}$$

of the symmetric scalar product, and

$$((f, g)) = (f, g) + i\,(f, ig)$$

of the non-symmetric scalar product.


Geometric interpretation of the symmetric scalar product. Let us consider
more generally an arbitrary metric space, that is to say a certain class of
elements associated with a certain definition of limit such that a distance
can be defined on it (satisfying the conditions P), Q), R), S), by definition
of distance).

By virtue of the triangle inequality R), to three distinct elements f, g, h of
this space there corresponds a Euclidean triangle (flattened or not) FGH whose
sides have as measures the numbers gh, hf, fg. The angles of this triangle have
measures, lying between 0 and π, which are well determined by relations such as

$$fg^2 = fh^2 + hg^2 - 2\,hf\cdot hg\cdot\cos\alpha.$$

It will then be natural to call, as in Euclidean geometry, the (symmetric)
scalar product of hf, hg the value of fh·gh·cos α, that is to say the quantity

$$\tfrac{1}{2}\left[hf^2 + hg^2 - fg^2\right].$$







To speak of the (symmetric) scalar product of two elements f and g, it will
suffice to choose a fixed point θ and to take as the (symmetric) scalar product
(f, g) of f and g that of θf, θg. We shall therefore put

$$(f, g) = \tfrac{1}{2}\left[\theta f^2 + \theta g^2 - fg^2\right].$$

It is clear that (f, g) will satisfy conditions γ') and δ').

Conditions α₁), α₂), β'), ε') involve addition and multiplication by a number.
They therefore have no meaning for an arbitrary metric space. But suppose now
that the metric space considered is a metric vector space, consequently
satisfying Postulate I, and suppose that the point 0 of Postulate I is taken
for θ. Then, if we have a von Neumann space, the distance $fg = \|f - g\|$ must be
a quantity which does not depend in an arbitrary manner on f and g, but which
is determined by f − g. Thus the distance must be subjected to the
supplementary condition

a) $fg = f_1 g_1$ when $g - f = g_1 - f_1$.

Conversely, if we have a metric vector space in which the distance satisfies
condition a), we can, by definition of the norm, put

$$\|f\| = 0f,$$

and consequently, if g − f = φ,

$$\|g - f\| = \|\varphi\| = 0\varphi,$$

and since g − f = φ − 0, we shall have by virtue of a)

$$fg = 0\varphi = \|\varphi\| = \|g - f\|,$$

that is to say

$$fg = \|g - f\|.$$

Whence finally

$$(f, g) = \tfrac{1}{2}\left[\|f\|^2 + \|g\|^2 - \|g - f\|^2\right]. \tag{15}$$

The introduction of this analytic formula, made above, is now justified by the
geometric meaning which has been given to it. Observe that, because of
condition P), we have $\|g - f\| = \|f - g\|$; in particular, for g = 0:

$$\|-f\| = \|f\|.$$

Whence, since g − f = −(f − g),

$$(f, g) = (g, f).$$

It will therefore remain only to verify condition β'').


The von Neumann space defined as a particular metric vector space.
Let us consider a metric vector space. It is now easy to indicate under what
conditions it will be a von Neumann space.



















If the distance 0f is called the norm of f, conditions δ'') and ε'') will be
satisfied of themselves.

If, in the ordinary conditions to which vector spaces are subjected
(E. A., p. 140), one supposes that multiplication by a constant is extended
to the case where this constant is complex, then, since one supposes for a
metric vector space that

$$\|a\xi\| = |a|\,\|\xi\|,$$

condition α'') will be satisfied of itself, for a = i.

It only remains to write in terms of distances condition β'') and formulas
(15) and (12). For this it suffices to substitute for f, g, h their
differences f − k, g − k, h − k with a fourth arbitrary element k, after having
replaced everywhere $\|f + g\|^2 - \|f\|^2 - \|g\|^2$ by the equal quantity
$\|f\|^2 + \|g\|^2 - \|f - g\|^2$. Under these conditions, β'') becomes:

$$\varphi\psi^2 = x^2 fk^2 + y^2 gk^2 + z^2 hk^2 + yz\,(gk^2 + hk^2 - gh^2) + zx\,(hk^2 + fk^2 - hf^2) + xy\,(fk^2 + gk^2 - fg^2),$$

for x, y, z real, with

$$\psi - \varphi = x(f - k) + y(g - k) + z(h - k),$$

and (15), (12) become

$$(f - k, g - k) = \tfrac{1}{2}\,(fk^2 + gk^2 - fg^2),$$

$$((f, g)) = (f, g) + i\,(f, ig).$$

A von Neumann space can therefore be defined as:

a metric vector space which is separable, complete and of infinitely many
dimensions, in which the properties of multiplication by a constant are
extended to the case where this constant is complex (the absolute value of
this constant being replaced by its modulus), in which the condition

$$\varphi\psi^2 = x^2 kf^2 + y^2 kg^2 + z^2 kh^2 + yz\,(kg^2 + kh^2 - gh^2) + zx\,(kh^2 + kf^2 - hf^2) + xy\,(kf^2 + kg^2 - fg^2)$$

is satisfied for x, y, z real, with

$$\psi - \varphi = x(f - k) + y(g - k) + z(h - k),$$

and in which one has put first

$$(f, g) = \tfrac{1}{2}\left[0f^2 + 0g^2 - fg^2\right]$$

and then

$$((f, g)) = (f, g) + i\,(f, ig).$$

Interprétation géométrique de la condition 6’’). La condition 6’) a une 
forme analytique qui ne parle pas a l’esprit. On peut lui donner une forme 








ESPACES APPLICABLES SUR L’ESPACE DE HILBERT 715 


qui sera plus intuitive, bien qu’en apparence plus compliquée. Rappelons 
cette condition 8”), que nous écrirons sous la forme: 


— yz(gh? — kg? — kh®) — 2x(hf? — kh? — kf?) — axy(fg? — kf? — kg). 


Or, on a vu qu’il existe un angle a compris entre 0 et 7 (éventuellement nul ou 
égal A x) tel que gh? = kg? + kh? — 2kg-khcosa. Et des angles analogues 8, y. 
Donec 8;) devient: 


By) \le(f — &) + yg — &) + 2(h — k) |? = akf? + yrkg? + 22kh? 
+ 2kg-kh-yz cos a + 2kh-kf-zx cos B + 2kf-kg-xy cos y. 


Let us show that to every system of four distinct points f, g, h, k of a
metric vector space satisfying condition β″) one can always associate
four points F, G, H, K of Euclidean space whose mutual distances
are equal to those of f, g, h, k. We shall suppose the 4 points
f, g, h, k distinct. The left side of β₂″) is always ≥ 0 as x, y, z
vary. Regarded as a quadratic trinomial in $\overline{kf}\cdot x$, the right side must therefore
have a discriminant $\Delta_1 \le 0$. We have

$$\Delta_1 = (\overline{kh}\cdot z\cos\beta + \overline{kg}\cdot y\cos\gamma)^2 - (y^2\,\overline{kg}^2 + z^2\,\overline{kh}^2 + 2\,\overline{kg}\cdot\overline{kh}\cdot yz\cos\alpha) \le 0$$
or:
$$-\overline{kh}^2 z^2\sin^2\beta - \overline{kg}^2 y^2\sin^2\gamma - 2\,\overline{kg}\cdot\overline{kh}\cdot yz\,(\cos\alpha - \cos\beta\cos\gamma) \le 0.$$

Regarded as a trinomial in $\overline{kg}\cdot y$, the left side must therefore have a
discriminant ≤ 0:

$$\overline{kh}^2 z^2\,(\cos\alpha - \cos\beta\cos\gamma)^2 - \overline{kh}^2 z^2\sin^2\beta\sin^2\gamma \le 0$$
or

$$(\cos\alpha - \cos\beta\cos\gamma)^2 - \sin^2\beta\sin^2\gamma \le 0.$$


Using this condition, we shall show that F, G, H, K can be determined.
Take a system of rectangular axes Oxyz. Let us show
that one can take K at O, F on Ox, G in the plane xOy. The coordinates of K will be
0, 0, 0; those of F: a″, 0, 0; of G: a′, b′, 0; of H: a, b, c. It suffices to prove
that these coordinates can be determined so that

$$a''^2 = \overline{kf}^2, \qquad a'^2 + b'^2 = \overline{kg}^2, \qquad a^2 + b^2 + c^2 = \overline{kh}^2,$$
$$(a' - a'')^2 + b'^2 = \overline{fg}^2, \qquad (a - a'')^2 + b^2 + c^2 = \overline{fh}^2,$$
$$(a - a')^2 + (b - b')^2 + c^2 = \overline{hg}^2.$$













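The last three equations can be replaced by
$$2a'a'' = \overline{kg}^2 + \overline{kf}^2 - \overline{fg}^2, \qquad 2aa'' = \overline{kh}^2 + \overline{kf}^2 - \overline{fh}^2, \qquad 2aa' + 2bb' = \overline{kg}^2 + \overline{kh}^2 - \overline{hg}^2$$
or
$$a'a'' = \overline{kg}\cdot\overline{kf}\cos\gamma, \qquad aa'' = \overline{kh}\cdot\overline{kf}\cos\beta, \qquad aa' + bb' = \overline{kg}\cdot\overline{kh}\cos\alpha.$$
Hence:
$$a' = \overline{kg}\cos\gamma, \qquad a = \overline{kh}\cos\beta, \qquad bb' = \overline{kg}\cdot\overline{kh}\,(\cos\alpha - \cos\beta\cos\gamma).$$
Now the first equations give:
$$b'^2 = \overline{kg}^2 - a'^2 = \overline{kg}^2\sin^2\gamma;$$

one can take $b' = \overline{kg}\sin\gamma$; $b = \overline{kh}\,(\cos\alpha - \cos\beta\cos\gamma)/\sin\gamma$. The quantities
a, a′, a″, b, b′ are thus determined. It remains to compute c:

$$c^2 = \overline{kh}^2 - a^2 - b^2 = \overline{kh}^2\left\{1 - \cos^2\beta - \frac{(\cos\alpha - \cos\beta\cos\gamma)^2}{\sin^2\gamma}\right\} = \frac{\overline{kh}^2}{\sin^2\gamma}\left\{\sin^2\beta\sin^2\gamma - (\cos\alpha - \cos\beta\cos\gamma)^2\right\}.$$

Now the expression in braces, as we saw above, is ≥ 0. Hence c can also be
computed. Finally, F, G, H, K have been determined.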
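
The construction just described can be sketched numerically. The following is a minimal illustration only, not part of the original argument: the function name `place_four_points` and its argument order are hypothetical, and the sketch assumes that f, g, k are not collinear (so that sin γ ≠ 0), a case the text does not treat separately.

```python
import math

def place_four_points(kf, kg, kh, fg, fh, gh):
    """Return Euclidean coordinates (K, F, G, H) realizing six given distances.

    Hypothetical helper following the construction in the text:
    K at the origin, F on the x-axis, G in the xy-plane.
    """
    # Angles at k from the law of cosines, e.g. gh^2 = kg^2 + kh^2 - 2 kg kh cos(alpha)
    cos_a = (kg**2 + kh**2 - gh**2) / (2 * kg * kh)
    cos_b = (kh**2 + kf**2 - fh**2) / (2 * kh * kf)
    cos_g = (kf**2 + kg**2 - fg**2) / (2 * kf * kg)
    sin_g = math.sqrt(1 - cos_g**2)  # assumption: sin(gamma) != 0

    a2 = kf                    # a'' : abscissa of F
    a1 = kg * cos_g            # a'  : abscissa of G
    b1 = kg * sin_g            # b'  : ordinate of G
    a = kh * cos_b
    b = kh * (cos_a - cos_b * cos_g) / sin_g
    c_sq = kh**2 - a**2 - b**2
    if c_sq < -1e-12:
        # The inequality (cos a - cos b cos g)^2 <= sin^2 b sin^2 g fails
        raise ValueError("distances are not realizable in Euclidean 3-space")
    c = math.sqrt(max(c_sq, 0.0))
    return (0.0, 0.0, 0.0), (a2, 0.0, 0.0), (a1, b1, 0.0), (a, b, c)

# Usage: six unit distances give a regular tetrahedron
K, F, G, H = place_four_points(1, 1, 1, 1, 1, 1)
```

The `ValueError` branch corresponds to the case where the quantity in braces is negative, which the condition derived above excludes.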

Consider now two points l, m (of the abstract space) belonging
to the linear manifold t determined by f, g, h, k, that is, representable
in the form
$$l = k + a(f - k) + b(g - k) + c(h - k)$$
$$m = k + a_1(f - k) + b_1(g - k) + c_1(h - k)$$
where $a, b, c, a_1, b_1, c_1$ are six arbitrary real numbers. We have
$$l - m = (a - a_1)(f - k) + (b - b_1)(g - k) + (c - c_1)(h - k)$$
and, setting $x = a - a_1$, $y = b - b_1$, $z = c - c_1$,
$$l - m = x(f - k) + y(g - k) + z(h - k).$$
Let L, M be the points of Euclidean space representable by
$$L = K + a(F - K) + b(G - K) + c(H - K)$$
$$M = K + a_1(F - K) + b_1(G - K) + c_1(H - K).$$

They belong to the linear manifold T determined by F, G, H, K and
correspond there to l, m. The formulas that establish the correspondence between l
and L show that this correspondence is vectorial. When condition β″)
holds, this correspondence is moreover an isometric mapping, that is, it
preserves distances. Indeed, by virtue of β₁″), we have

$$\overline{lm}^2 = x^2\,\overline{kf}^2 + \cdots - yz\,(\overline{gh}^2 - \overline{kg}^2 - \overline{kh}^2) - \cdots.$$





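Hence:
$$\overline{lm}^2 = x^2\,\overline{KF}^2 + y^2\,\overline{KG}^2 + z^2\,\overline{KH}^2 - yz\,[\overline{GH}^2 - \overline{KG}^2 - \overline{KH}^2] - zx\,[\overline{HF}^2 - \overline{KH}^2 - \overline{KF}^2] - xy\,[\overline{FG}^2 - \overline{KF}^2 - \overline{KG}^2].$$

But this expression is, as one easily verifies, also equal to the
square of the length of the Euclidean vector

$$L - M = x(F - K) + y(G - K) + z(H - K), \quad\text{or}\quad \overrightarrow{ML} = x\,\overrightarrow{KF} + y\,\overrightarrow{KG} + z\,\overrightarrow{KH}.$$

The distances lm, LM are therefore equal.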

When the points F, G, H, K form a tetrahedron, the manifold T coincides
with the whole Euclidean space. In this case, the manifold t is what
we may call a "hyperplane" of the vector space considered. This is
the case where F, G, H, K do not belong to one and the same plane. For F, G,
H, K to belong to one and the same plane, there must exist numbers u, v, w,
not all zero, such that, for example,

$$u(F - K) + v(G - K) + w(H - K) = 0.$$
Hence, by virtue of β₁″),
$$0 = \|u(F - K) + v(G - K) + w(H - K)\|^2 = u^2\,\overline{KF}^2 + \cdots - vw\,[\overline{GH}^2 - \overline{KG}^2 - \overline{KH}^2] - \cdots$$
$$= u^2\,\overline{kf}^2 + \cdots - vw\,[\overline{gh}^2 - \overline{kg}^2 - \overline{kh}^2] - \cdots = \|u(f - k) + v(g - k) + w(h - k)\|^2$$
and consequently
$$u(f - k) + v(g - k) + w(h - k) = 0.$$

Consequently f, g, h, k belong to what may be called an abstract plane of
the space considered. Conversely, it is clear that if f, g, h, k belong to
one and the same abstract plane (or linear manifold of 2 dimensions), then F, G, H, K
belong to one and the same Euclidean plane.

Hence, if f, g, h, k do not belong to one and the same abstract plane, F, G, H, K
are the vertices of a tetrahedron and the three-dimensional linear manifold T
determined by F, G, H, K is none other than Euclidean space. The latter is
thus applicable onto the linear manifold t determined by f, g, h, k,
which are not in one and the same abstract plane. t is a linear manifold
that has exactly 3 dimensions and that we may call a hyperplane.

Finally: by virtue of condition β″), every hyperplane of the abstract
von Neumann space is vectorially applicable onto Euclidean space, that is,
in one-to-one point correspondence with it, with preservation of vectors and
distances.

Conversely, this condition of geometric form implies condition β″).







Indeed, let f, g, h, k be four points of an abstract hyperplane not lying in
one abstract plane. There exists an isometric mapping of this hyperplane onto
Euclidean space which carries f, g, h, k into four points F, G, H, K forming a
tetrahedron of Euclidean space, and two points

$$l = k + a(f - k) + b(g - k) + c(h - k)$$
$$m = k + a_1(f - k) + b_1(g - k) + c_1(h - k)$$
into two points
$$L = K + a(F - K) + b(G - K) + c(H - K)$$
$$M = K + a_1(F - K) + b_1(G - K) + c_1(H - K)$$

(this, at least, is how we define the correspondence in question). By
hypothesis $\overline{lm} = \overline{LM}$, and in Euclidean space we have

$$\overline{LM}^2 = x^2\,\overline{KF}^2 + \cdots - yz\,(\overline{GH}^2 - \overline{KG}^2 - \overline{KH}^2) - \cdots = x^2\,\overline{kf}^2 + \cdots - yz\,(\overline{gh}^2 - \overline{kg}^2 - \overline{kh}^2) - \cdots.$$
Hence, finally,
$$\|x(f - k) + y(g - k) + z(h - k)\|^2 = x^2\,\overline{kf}^2 + \cdots - yz\,(\overline{gh}^2 - \overline{kg}^2 - \overline{kh}^2) - \cdots,$$

that is, condition β″) is satisfied.

In summary, in von Neumann's definitions one can replace
Postulates II, III, which concern the non-symmetric scalar product, followed by the
formula defining the distance $\overline{fg} = \sqrt{(([f - g], [f - g]))}$ from this scalar
product, by the hypothesis that there exists a distance satisfying conditions
P), Q), R), S), such that the distance fg depends only on f − g and such,
moreover, that every hyperplane of the space considered can be put in one-to-one
point correspondence, preserving distances and vectors, with Euclidean
space.⁴





UNIVERSITÉ DE PARIS.





4 We are informed, while correcting the proofs, that Messrs. von Neumann
and P. Jordan have been able to bring an important simplification to this result (it will be found
set out in this volume) by replacing the hyperplanes and Euclidean space by abstract
planes and a Euclidean plane.













ANNALS OF MATHEMATICS
Vol. 36, No. 3, July, 1935


ON INNER PRODUCTS IN LINEAR, METRIC SPACES
By P. JORDAN AND J. V. NEUMANN
(Received April 16, 1935) 


1. In his foregoing paper¹ Mr. M. Fréchet discussed the following question:
When is a linear, metric space L isometric with a generalized Hilbert space?²
In other words: When can one define in it a bilinear symmetric inner product, 
from which its given metric can be derived in the customary way? Mr. Fréchet 
discovered a necessary and sufficient algebraic condition, from which he derived 
this more abstract criterium: This is the case if and only if every ≤ 3-dimensional
linear subspace L’ of L is isometric with a Euclidean space. 

On the pages which follow we will derive another necessary and sufficient 
algebraic condition, which implies that Mr. Fréchet’s abstract criterium can be 
weakened as follows: The answer to the above question is affirmative if and 
only if every ≤ 2-dimensional linear subspace L’ of L is isometric with a Euclid-
ean space. 

The criterium which we will derive has some further interest, because it shows 
that the linear spaces with a bilinear symmetric inner product are in a certain 
sense limiting cases of the general linear metric spaces. 

We will only consider complex linear spaces. Real linear spaces could be 
discussed along the same lines, even with some simplifications. Mr. Fréchet 
showed, loc. cit., how the two types of linearity are connected. 


2. We repeat the customary definitions of linearity, metricity, and of the 
bilinear symmetric inner products. As we consider complex linearity, the 
symmetry of inner products will be interpreted as Hermitian symmetry. 

The definitions in question are: 

DEFINITION 1.³ A space $L$ is (complex) linear and metric, if for all $f, g \in L$ and





1 Sur la définition axiomatique d’une classe d’espaces vectoriels distanciés applicables 
vectoriellement sur l’espace de Hilbert, Ann. of Math. (2) 36 (1935) pp. 705-718. 

2 Hilbert space is uniquely characterized (up to an isomorphism) by five postulates
A-E, which have been formulated by one of us, Math. Ann. vol. 102 (1929), pp. 63-66. Of
these postulates A, B are the really essential ones: The omission of C would only add the
(finite dimensional) Euclidean spaces; the omission of D could be compensated by the
standard manipulation of "completion;" in the absence of E essentially new hyper-Hilbert
spaces arise, but they are nevertheless similar to Hilbert space under most aspects.

The hyper-Hilbert spaces (without E) have been first discussed by H. Löwig, Acta
Szeged, vol. 7 (1934), pp. 1-33. The "completion" of spaces without D plays a fundamental
role in the work of K. Friedrichs, Math. Ann., vol. 109 (1933), pp. 472-476. A connected
discussion of the respective rôles of the conditions A-E is to be found in a series of mimeographed
lectures on operator theory, given by one of us in Princeton in the year 1933-34.

3 Cf. S. Banach, Fund. Math., Vol. 3, pp. 134-136 (1922); and H. Hahn, Monatshefte
für Math. und Phys., Vol. 32, pp. 1-4 (1922).




all complex numbers $\alpha$ the quantities $\alpha\cdot f$, $f + g \in L$ and the real number $\|f\|$
are defined, with the following properties:
1. $\alpha\cdot f$, $f + g$ obey the rules of the vector-calculus:

$1_1$. $1\cdot f = f$, $0\cdot f = 0$ (independent of $f$),
$1_2$. $(\alpha + \beta)\cdot f = \alpha\cdot f + \beta\cdot f$, $\alpha\cdot(f + g) = \alpha\cdot f + \alpha\cdot g$,
$1_3$. $\alpha\beta\cdot f = \alpha\cdot(\beta\cdot f)$,
$1_4$. $f + g = g + f$, $(f + g) + h = f + (g + h)$.

(We use the customary notations $(-1)\cdot f = -f$, $f + (-g) = f - g$.)
2. $\|f\|$ obeys the rules for an absolute value:

$2_1$. $\|f\| > 0$ if $f \ne 0$,
$2_2$. $\|f + g\| \le \|f\| + \|g\|$,
$2_3$. $\|\alpha\cdot f\| = |\alpha|\cdot\|f\|$.

DEFINITION 2. A space $L$ is generalized (complex) linear and metric, if it fulfills
all conditions of the preceding definition, except for $2_3$, which is replaced
by the weaker condition:

$2_3'$. $\|\alpha\cdot f\| \to 0$ if $\alpha \to 0$, and $\|i\cdot f\| = \|f\|$.

DEFINITION 3.⁴ A space $L$ is linear with a (bilinear and symmetric) inner
product, if for all $f, g \in L$ and all complex numbers $\alpha$ the quantities $\alpha\cdot f$, $f + g \in L$
and the complex number $(f, g)$ are defined, with the following properties:
1. As in Definition 1.
2. $(f, g)$ is linear in $f$, conjugate linear in $g$, Hermitian-symmetric, and definite:

$2_1$. $(\alpha\cdot f, g) = \alpha\cdot(f, g)$,
$2_2$. $(f' + f'', g) = (f', g) + (f'', g)$,
$2_3$. $(f, g) = \overline{(g, f)}$.
$2_3$ transforms $2_1$, $2_2$ into $2_1^*$: $(f, \alpha\cdot g) = \bar\alpha\cdot(f, g)$ and
$2_2^*$: $(f, g' + g'') = (f, g') + (f, g'')$.
$2_4$. $(f, f)$ (which is real by $2_3$) $> 0$ if $f \ne 0$.

Well known considerations show that $\|f\| = \sqrt{(f, f)}$ fulfills the conditions 2 of
Definition 1.⁵ In this sense every linear space $L$ with an inner product is at the
same time a linear space with a metric. We call $\|f\| = \sqrt{(f, f)}$ the metric derived
from the inner product $(f, g)$.


4 Cf. footnote 2; we are using the conditions A, B.
5 Cf. pp. 64-65 in the Math. Ann. paper quoted in footnote 2.










3. We will now determine all generalized linear, metric spaces $L$, which can be
considered at the same time as linear spaces with an inner product. This is
our criterium:

THEOREM I. Let $L$ be a generalized linear, metric space, with the operations
$\alpha\cdot f$, $f + g$, $\|f\|$. In order that it be possible to define an operation $(f, g)$ in $L$, so
that $L$ with the original $\alpha\cdot f$, $f + g$ and this $(f, g)$ is a linear space with an inner
product, and that the original $\|f\|$ is the metric derived from this $(f, g)$, the following
condition is necessary and sufficient:

$$(*)\qquad \|f + g\|^2 + \|f - g\|^2 = 2\,(\|f\|^2 + \|g\|^2) \quad\text{for all } f, g \in L.\ {}^{6}$$

If this is the case, then $(f, g)$ is uniquely determined by the above requirements.

PROOF: Assume first that an $(f, g)$ of the desired sort exists. Relations $2_1$, $2_1^*$
(for $\alpha = -1$) and $2_2$, $2_2^*$ from Definition 3 give immediately:

$$(f + g, f + g) + (f - g, f - g) = 2\,((f, f) + (g, g)),$$
$$(f + g, f + g) - (f - g, f - g) = 2\,((f, g) + (g, f)).$$

The first equation coincides with (*), thus (*) is necessary. The second equation
gives, considering $2_3$, $\Re(f, g) = \tfrac14(\|f + g\|^2 - \|f - g\|^2)$. $2_1$ (for $\alpha = i$)
gives $(i\cdot f, g) = i\cdot(f, g)$, $\Re(i\cdot f, g) = -\Im(f, g)$. Thus $\|f\|$ (for all $f \in L$) determines
$\Re(f, g)$ and $\Im(f, g)$ uniquely, so that $(f, g)$ itself is unique. So we need only to
prove the sufficiency of (*).

Assume therefore that (*) is fulfilled. We must define $(f, g)$, but we know that
the only possibility is

$$(**)\qquad \Re(f, g) = \tfrac14\,(\|f + g\|^2 - \|f - g\|^2), \qquad \Im(f, g) = -\Re(i\cdot f, g).$$

So our task is to verify $2_1$-$2_4$ (Definition 3) and $\|f\| = \sqrt{(f, f)}$.
Replace $f, g$ in (*) by $f' \pm g$, $f''$, and subtract. Then

$$\|f' + f'' + g\|^2 + \|f' - f'' + g\|^2 - \|f' + f'' - g\|^2 - \|f' - f'' - g\|^2 = 2\,(\|f' + g\|^2 - \|f' - g\|^2)$$

obtains, that is (considering (**)):

$$(***)\qquad \Re(f' + f'', g) + \Re(f' - f'', g) = 2\,\Re(f', g).$$

(*) with $f = 0$ proves $\|-g\|^2 = \|g\|^2$, and so (**) with $f = 0$ gives $\Re(0, g) = 0$.
Therefore (***) with $f' = f''$ is $\Re(2f', g) = 2\,\Re(f', g)$. Thus (***) itself becomes

$$\Re(f' + f'', g) + \Re(f' - f'', g) = \Re(2f', g),$$





6 This equation occurred in a paper of E. Wigner and the authors, Annals of Math., vol.
35 (1934), p. 32. Its importance in generalized Hilbert space has been pointed out by F.
Riesz, Acta Szeged, vol. 7 (1934), p. 36, cf. equation (6), loc. cit.













or if we replace $f'$, $f''$ by $\tfrac12(f' + f'')$, $\tfrac12(f' - f'')$:

$$\Re(f', g) + \Re(f'', g) = \Re(f' + f'', g).$$
Now (**) gives immediately
$$(f', g) + (f'', g) = (f' + f'', g),$$
proving $2_2$ (Definition 3).

$2_2$ (Definition 1) with $f = f''$, $g = f' - f''$ gives $\|f'\| - \|f''\| \le \|f' - f''\|$.
Interchanging $f'$, $f''$ now shows $\|f''\| - \|f'\| \le \|f'' - f'\| = \|f' - f''\|$
(remember $\|-g\| = \|g\|$), thus $\bigl|\,\|f'\| - \|f''\|\,\bigr| \le \|f' - f''\|$. Therefore
$\bigl|\,\|\alpha\cdot f + g\| - \|\beta\cdot f + g\|\,\bigr| \le \|(\alpha - \beta)\cdot f\|$. So $\alpha \to \beta$ implies by $2_3'$ (Definition 2)
$\|\alpha\cdot f + g\| \to \|\beta\cdot f + g\|$, that is, $\|\alpha\cdot f + g\|$ is continuous in $\alpha$. Now
by (**) $\Re(\alpha\cdot f, g)$ and $(\alpha\cdot f, g)$ are also continuous in $\alpha$.

Consider now the set $S$ of all $\alpha$'s for which $2_1$ (Definition 3) holds. $1 \in S$ is
obvious; by $2_2$ (Definition 3) $\alpha, \beta \in S$ imply $\alpha \pm \beta \in S$. So all $\alpha = 0, \pm 1, \pm 2, \cdots$
are in $S$. Clearly $\alpha, \beta \in S$, $\beta \ne 0$, imply $\alpha/\beta \in S$, so all rational $\alpha$'s are in $S$.
The above proved continuity of $(\alpha\cdot f, g)$ in $\alpha$ implies that $S$ is closed, so all real
$\alpha$'s are in $S$. Finally (**) gives $i \in S$ (use $(-f, g) = -(f, g)$, as $-1 \in S$!), therefore
if $\alpha_1$, $\alpha_2$ are real, $\alpha_1 + i\alpha_2 = \alpha_1 + \dfrac{\alpha_2}{1/i} \in S$. Thus all complex $\alpha$'s are in $S$,
that is $2_1$ holds always.

As $\|i\cdot f\| = \|f\|$, the first equation in (**) gives $\Re(if, ig) = \Re(f, g)$. It gives
immediately $\Re(f, g) = \Re(g, f)$ too, and these combined give

$$\Re(if, g) = \Re(i\cdot if, ig) = \Re(-f, ig) = -\Re(f, ig) = -\Re(ig, f).$$

So $(f, g) = \overline{(g, f)}$, proving $2_3$ (Definition 3). $2_1$ and $2_3$ (all in Definition 3)
imply $2_1^*$, and thus $(\alpha f, \alpha f) = |\alpha|^2 (f, f)$.

Now (**) gives $\Re(f, f) = \tfrac14(\|2\cdot f\|^2 - \|0\cdot f\|^2) = \|f\|^2$, $\Im(f, f) = -\tfrac14(\|(i + 1)\cdot f\|^2 - \|(i - 1)\cdot f\|^2) = 0$,
so $(f, f) = \|f\|^2$. This proves $2_4$ (Definition 3), and $\|f\| = \sqrt{(f, f)}$.

Thus the proof is completed. 
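
As a numerical illustration of the uniqueness part of Theorem I, the following minimal sketch rebuilds the inner product from the norm alone, using the formulas Re (f, g) = ¼(‖f + g‖² − ‖f − g‖²) and Im (f, g) = −Re (i·f, g). It assumes the concrete space of finite complex sequences with the Euclidean norm; the function names are hypothetical and chosen only for this example.

```python
import math

def norm(v):
    """Euclidean norm on C^n (the assumed concrete metric)."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(alpha, v):
    return [alpha * z for z in v]

def inner_from_norm(f, g):
    """Inner product recovered from the norm only."""
    def re_part(u, v):
        # Re(u, v) = (1/4)(||u + v||^2 - ||u - v||^2)
        return 0.25 * (norm(add(u, v)) ** 2 - norm(add(u, scale(-1, v))) ** 2)
    real = re_part(f, g)
    imag = -re_part(scale(1j, f), g)  # Im(f, g) = -Re(i f, g)
    return complex(real, imag)

def hermitian_inner(f, g):
    """Reference inner product: linear in f, conjugate linear in g."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

f = [1 + 2j, 3 - 1j]
g = [2 - 1j, 0 + 1j]
difference = abs(inner_from_norm(f, g) - hermitian_inner(f, g))
```

Here `difference` should vanish up to rounding, since the Euclidean norm satisfies (*).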


4. The condition that every ≤ 2-dimensional subspace $L'$ of $L$ be isometric
to a Euclidean space, is obviously necessary for the existence of an inner product
in the generalized linear, metric space $L$. It is sufficient, too, because if it is
fulfilled, we can argue as follows: If $f_0, g_0 \in L$ the space $L'$ of all $\alpha\cdot f_0 + \beta\cdot g_0$ ($\alpha$, $\beta$
arbitrary complex numbers) is ≤ 2 dimensional, thus (*) holds in $L'$ (as in every
Euclidean space). Therefore it holds in particular for $f = f_0$, $g = g_0$, and as
$f_0$, $g_0$ are arbitrary, Theorem I proves the existence of an inner product.





5. The following theorem holds for linear, metric spaces:
THEOREM II. Let $L$ be a linear, metric space. Define

$$C_{f,g} = \frac12\,\frac{\|f + g\|^2 + \|f - g\|^2}{\|f\|^2 + \|g\|^2} \quad\text{for } f, g \in L, \text{ not } f = g = 0,$$

and denote the l.u.b. of the $C_{f,g}$ by $b$, and their g.l.b. by $a$. Then we have

$$\tfrac12 \le a \le 1 \le b \le 2, \qquad a = \frac1b.$$

The linear spaces with a (bilinear and symmetric) inner product represent the
extreme case

$$a = b = 1.$$

PROOF: Clearly $0 \le a \le b$. $2_2$ (Definition 1) gives:

$$C_{f,g} \le \frac12\,\frac{2\,(\|f\| + \|g\|)^2}{\|f\|^2 + \|g\|^2} \le \frac12\,\frac{2\cdot 2\,(\|f\|^2 + \|g\|^2)}{\|f\|^2 + \|g\|^2} = 2,$$

so $b \le 2$. $2_3$ (Definition 1) for $\alpha = 2$ gives:

$$C_{f+g,\,f-g} = \frac12\,\frac{\|2f\|^2 + \|2g\|^2}{\|f + g\|^2 + \|f - g\|^2} = 2\,\frac{\|f\|^2 + \|g\|^2}{\|f + g\|^2 + \|f - g\|^2} = \frac{1}{C_{f,g}},$$

so $a = 1/b$. Thus $b \le 2$ implies $a \ge \tfrac12$ and $a \le b$ implies $a \le 1 \le b$, together:

$$\tfrac12 \le a \le 1 \le b \le 2.$$

That linear spaces with an inner product are characterized by $a = b = 1$
is the statement of Theorem I.

ROSTOCK, GERMANY AND PRINCETON, N. J.













ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


REMARKS TO MAURICE FRÉCHET'S ARTICLE "SUR LA DÉFINITION
AXIOMATIQUE D'UNE CLASSE D'ESPACE DISTANCIÉS VECTORIELLEMENT
APPLICABLE SUR L'ESPACE DE HILBERT"¹


By I. J. SCHOENBERG


(Received April 16, 1935) 


1. Fréchet's developments in the last section of his article suggest an elegant
solution of the following problem.
Let

$$a_{ik} = a_{ki} \qquad (i \ne k;\ i, k = 0, 1, \cdots, n)$$

be $\tfrac12 n(n + 1)$ given positive quantities. What are the necessary and sufficient
conditions that they be the lengths of the edges of a n-simplex $A_0A_1 \cdots A_n$? More
general, what are the conditions that they be the lengths of the edges of a n-"simplex"²
$A_0A_1 \cdots A_n$ lying in a euclidean space $R_r$ ($1 \le r \le n$) but not in a $R_{r-1}$?

This problem is fundamental in K. Menger's metric investigation of euclidean
spaces ([6] and [7], particularly his third fundamental theorem in [7], pp. 737-743).
It was solved by Menger by means of equations and inequalities involving
certain determinants. Theorem 1 below furnishes a complete and independent
solution of this problem. Theorem 2 solves the similar problem for
spherical spaces previously treated by Menger's methods by L. M. Blumenthal
and G. A. Garrett ([1]) and Laura Klanfer ([5]); it may be conveniently applied
(Theorems 3 and 3′) to prove and extend a theorem of K. Gödel ([4]). The
method of Theorem 1 is finally applied to solve the corresponding problem for
spaces with indefinite line element recently considered by A. Wald ([8]) and
H. S. M. Coxeter and J. A. Todd ([2]).


Construction of simplexes of given edges in euclidean spaces 


2. A complete answer to the questions stated above is given by the following 
theorem.

THEOREM 1. A necessary and sufficient condition that the $a_{ik}$ be the lengths of
the edges of an n-"simplex" $A_0A_1 \cdots A_n$ lying in $R_r$, but not in $R_{r-1}$, is that the
quadratic form

$$(1)\qquad F(x_1, x_2, \cdots, x_n) = \sum_{i=1}^{n} a_{0i}^2\,x_i^2 + \sum_{i<k} (a_{0i}^2 + a_{0k}^2 - a_{ik}^2)\,x_i x_k = \frac12 \sum_{i,k=1}^{n} (a_{0i}^2 + a_{0k}^2 - a_{ik}^2)\,x_i x_k$$

(with $a_{ik} = 0$ if $i = k$)

be positive, i.e. always ≥ 0, and of rank r.


1 These Annals, vol. 36 (1935), pp. 705-718.
2 The quotation marks should indicate that the configuration may lie in a euclidean space
of less than n dimensions.

The condition is necessary. Let $A_0A_1 \cdots A_n$ be an n-"simplex" with $A_iA_k = a_{ik}$.
Let $A_0 = O$ be the origin of a $R_n$ in which $A_i$ has the cartesian coördinates
$a_{i1}, a_{i2}, \cdots, a_{in}$. The point (in vector space notation)

$$P = x_1A_1 + x_2A_2 + \cdots + x_nA_n = (\xi_1, \xi_2, \cdots, \xi_n)$$
has the coördinates

$$\xi_\nu = x_1a_{1\nu} + x_2a_{2\nu} + \cdots + x_na_{n\nu} \qquad (\nu = 1, \cdots, n),$$

whence
$$OP^2 = \|P\|^2 = \sum_{\nu=1}^{n} \xi_\nu^2 = \sum_{\nu=1}^{n} (x_1a_{1\nu} + \cdots + x_na_{n\nu})^2 = \sum_{i=1}^{n} x_i^2 \sum_{\nu=1}^{n} a_{i\nu}^2 + 2\sum_{i<k} x_i x_k \sum_{\nu=1}^{n} a_{i\nu}a_{k\nu}.$$
Since
$$\sum_{\nu=1}^{n} a_{i\nu}^2 = OA_i^2 = a_{0i}^2,$$
$$2\sum_{\nu=1}^{n} a_{i\nu}a_{k\nu} = \sum_{\nu=1}^{n} a_{i\nu}^2 + \sum_{\nu=1}^{n} a_{k\nu}^2 - \sum_{\nu=1}^{n} (a_{i\nu} - a_{k\nu})^2 = A_0A_i^2 + A_0A_k^2 - A_iA_k^2 = a_{0i}^2 + a_{0k}^2 - a_{ik}^2,$$
we have
$$(2)\qquad OP^2 = \|x_1A_1 + \cdots + x_nA_n\|^2 = F(x_1, x_2, \cdots, x_n).$$

Hence $F(x_1, \cdots, x_n)$ is positive. It follows furthermore from our assumptions
that $P = 0$, hence $F = 0$, on a linear manifold of $n - r$ dimensions in the variables
$x_1, \cdots, x_n$; hence $F$ is of rank $r$.

The condition is sufficient. Let us first assume $F$ to be positive definite, i.e.
$r = n$. By means of a certain linear non-singular transformation

$$(3)\qquad (y) = H(x)$$

we get the identity
$$(4)\qquad F(x_1, \cdots, x_n) = y_1^2 + y_2^2 + \cdots + y_n^2.$$







Call $A_0$ the origin of the cartesian space of the variables $(y_1, \cdots, y_n)$ and
$A_1, A_2, \cdots, A_n$
the n points which in virtue of (3) correspond to
$$(5)\qquad (x_1, x_2, \cdots, x_n) = (1, 0, \cdots, 0),\ (0, 1, 0, \cdots, 0),\ \cdots,\ (0, 0, \cdots, 0, 1),$$

respectively. Their y-coördinates are readily found by (3). For their mutual
distances we find by (3), (4) and (5), with 1 in the i-th place and −1 in the k-th place,

$$A_0A_i^2 = F(0, \cdots, 1, \cdots, 0) = a_{0i}^2,$$
$$A_iA_k^2 = F(0, \cdots, 1, \cdots, -1, \cdots, 0) = a_{0i}^2 + a_{0k}^2 - (a_{0i}^2 + a_{0k}^2 - a_{ik}^2) = a_{ik}^2 \qquad (i < k),$$

which show that $A_0A_1 \cdots A_n$ is precisely the n-simplex we are looking for. It is
indeed an n-simplex because the points (5) are independent and (3) is non-singular.

If $r < n$, then (4) has to be replaced by
$$(6)\qquad F(x_1, \cdots, x_n) = y_1^2 + y_2^2 + \cdots + y_r^2.$$

The above procedure gives an n-simplex $A_0'A_1' \cdots A_n'$, however the quantities
$$F(1, 0, \cdots, 0) = a_{01}^2, \qquad F(1, -1, 0, \cdots, 0) = a_{12}^2, \ \cdots$$

are no more the squared lengths of the edges $A_0'A_1'$, $A_1'A_2'$, $\cdots$, but, viewing (6),
the squared lengths of their projections on the sub-space $(y_1, \cdots, y_r)$, i.e., on
the manifold $y_{r+1} = \cdots = y_n = 0$. Hence the projection $A_0A_1 \cdots A_n$ on this
manifold of the n-simplex $A_0'A_1' \cdots A_n'$ is an n-"simplex" of the type we are looking
for, i.e. with $A_iA_k = a_{ik}$. This n-"simplex" $A_0A_1 \cdots A_n$ is by construction
contained in a $R_r$ but not in a $R_{r-1}$, as readily seen.

Remark. If the matrix $H$ of (3) is $H = \|h_{ik}\|$, then the y-coördinates of the
vertices $A_i'$ and $A_i$ are

$$A_i' = (h_{1i}, h_{2i}, \cdots, h_{ni}), \qquad A_i = (h_{1i}, h_{2i}, \cdots, h_{ri}, 0, \cdots, 0).$$

The actual construction (i.e. determination of the coördinates of its vertices) of an
n-"simplex" of edges $a_{ik}$ is therefore carried out by a reduction of the quadratic
form (1) to its canonical form (6). This is a problem of the second degree, for
the transformation (3) is by no means required to be orthogonal.
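
The reduction described in the Remark can be sketched in code for the positive definite case r = n. The example below is an illustration under assumptions, not the author's procedure: it chooses one particular non-orthogonal reduction (a Cholesky factorization of the coefficient matrix of the form (1)), and the name `simplex_from_edges` is hypothetical.

```python
import math

def simplex_from_edges(a):
    """Coordinates of A_0, ..., A_n in R^n from the edge lengths a[i][k].

    `a` is a symmetric (n+1) x (n+1) table with zero diagonal.
    Assumes the quadratic form (1) is positive definite (r = n).
    """
    n = len(a) - 1
    # Coefficient matrix of (1): G[i][k] = (a_0i^2 + a_0k^2 - a_ik^2) / 2
    G = [[0.5 * (a[0][i] ** 2 + a[0][k] ** 2 - a[i][k] ** 2)
          for k in range(1, n + 1)] for i in range(1, n + 1)]
    # Cholesky factorization G = L L^T, so F(x) = |L^T x|^2 and H = L^T.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("form (1) is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # The i-th column of H = L^T (i.e. the i-th row of L) gives vertex A_i.
    return [[0.0] * n] + L

# Usage: a 3-4-5 right triangle
vertices = simplex_from_edges([[0, 3, 4], [3, 0, 5], [4, 5, 0]])
```

A Cholesky factor is triangular rather than orthogonal, which matches the remark that the transformation (3) need not be orthogonal. The degenerate case r < n would need a different reduction and is not covered by this sketch.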

As an illustration of this method let us construct a regular n-simplex with
$a_{ik} = 1$. By (1) we have

$$F(x_1, \cdots, x_n) = \sum_{i=1}^{n} x_i^2 + \sum_{i<k} x_i x_k.$$

The identity

$$F(x_1, \cdots, x_n) = \sum_{i=1}^{n} \frac{i + 1}{2i}\left(x_i + \frac{x_{i+1}}{i + 1} + \frac{x_{i+2}}{i + 1} + \frac{x_{i+3}}{i + 1} + \cdots\right)^2 \qquad (x_i = 0, \text{ if } i > n),$$

shows that $F$ is positive definite, hence the existence of our regular n-simplex
is insured. The coördinates of the vertices of one such simplex may be read off
from this last identity: one vertex is $A_0 = (0, \cdots, 0)$ while the coördinates of
$A_\nu$ ($\nu = 1, \cdots, n$) are

$$\frac{1}{\sqrt{2\cdot 1\cdot 2}},\ \frac{1}{\sqrt{2\cdot 2\cdot 3}},\ \frac{1}{\sqrt{2\cdot 3\cdot 4}},\ \cdots,\ \frac{1}{\sqrt{2(\nu - 1)\nu}},\ \sqrt{\frac{\nu + 1}{2\nu}},\ \underbrace{0, \cdots, 0}_{n - \nu}.$$
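
The vertex coördinates of the regular simplex can be checked directly. The short sketch below writes the ν-th vertex as 1/√(2i(i+1)) for i = 1, …, ν − 1, followed by √((ν+1)/(2ν)) and n − ν zeros; the function name is hypothetical and the check is only a numerical one.

```python
import math

def regular_simplex_vertices(n):
    """Vertices A_0, ..., A_n of a regular n-simplex with unit edges."""
    vertices = [[0.0] * n]  # A_0 is the origin
    for v in range(1, n + 1):
        coords = [1.0 / math.sqrt(2 * i * (i + 1)) for i in range(1, v)]
        coords.append(math.sqrt((v + 1) / (2.0 * v)))
        coords += [0.0] * (n - v)  # n - v trailing zeros
        vertices.append(coords)
    return vertices

# Usage: all pairwise distances of the 4-simplex
V = regular_simplex_vertices(4)
distances = [math.dist(V[i], V[k]) for i in range(5) for k in range(i + 1, 5)]
```

Every entry of `distances` should equal 1 up to rounding.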


Construction of simplexes of given edges in spherical spaces 








3. Denote by $S_r^\rho$ the r-dimensional spherical space
$$x_1^2 + x_2^2 + \cdots + x_{r+1}^2 = \rho^2$$
immersed in a $R_{r+1}$. The problem is as follows.

Given $\binom{n}{2}$ positive quantities $a_{ik}$ ($i \ne k$; $i, k = 1, 2, \cdots, n$) and a positive $\rho$, to
decide whether there exist, on some $S_r^\rho$, n points $A_1, A_2, \cdots, A_n$, such that their
spherical distances $\widehat{A_iA_k} = a_{ik}$.

According to a remark of J. von Neumann this problem may be reduced to
the preceding one regarding the construction of simplexes in euclidean spaces.³
Combining his remark with Theorem 1 we get the following theorem which solves
completely the problem stated above.

THEOREM 2. Let $a_{ik} = a_{ki}$ ($i \ne k$; $i, k = 1, 2, \cdots, n$) be $\binom{n}{2}$ given positive
quantities. Necessary and sufficient conditions that there be, on some spherical
manifold of radius $\rho$, n points $A_1, A_2, \cdots, A_n$, of mutual spherical distances equal
to the $a_{ik}$, i.e. $\widehat{A_iA_k} = a_{ik}$, are the inequalities

$$(7)\qquad a_{ik} \le \pi\rho,$$

together with the condition that the quadratic form

$$(8)\qquad \Phi(x_1, x_2, \cdots, x_n) = \sum_{i,k=1}^{n} \cos(a_{ik}/\rho)\,x_i x_k \qquad (a_{ik} = 0, \text{ if } i = k)$$

be positive. If $r$ (≥ 1) is the rank of $\Phi$, then we can find such points in $S_{r-1}^\rho$, but
not in $S_{r-2}^\rho$ (which is undefined if $r = 1$).





3 After Prof. von Neumann's verbal communication I noticed that the same reduction
has already been used by Laura Klanfer ([5]) to carry over Menger's results from euclidean
spaces to spherical spaces.
















The meaning of the inequalities (7) is obvious viewing the fact that no distance
on a sphere of radius $\rho$ can exceed $\pi\rho$. Suppose there are required points $A_1, \cdots, A_n$
on some $S_m^\rho$ ($m \ge 1$). Call $A_0$ the sphere's center. Then $A_0A_1 \cdots A_n$
is an n-"simplex" in $R_{m+1}$, the lengths of its edges being

$$(9)\qquad A_0A_i = \rho = \bar a_{0i}, \qquad A_iA_k = 2\rho\sin\frac{a_{ik}}{2\rho} = \bar a_{ik} \qquad (i, k = 1, \cdots, n).$$
From Theorem 1 we know that the construction of such a "simplex" amounts to
the investigation of the quadratic form

$$\frac12\sum_{i,k=1}^{n} (\bar a_{0i}^2 + \bar a_{0k}^2 - \bar a_{ik}^2)\,x_i x_k = \rho^2\sum_{i,k=1}^{n}\left(1 - 2\sin^2\frac{a_{ik}}{2\rho}\right)x_i x_k = \rho^2\sum_{i,k=1}^{n}\cos(a_{ik}/\rho)\,x_i x_k = \rho^2\,\Phi.$$
Its positivity is necessary and sufficient for the existence of $A_0A_1 \cdots A_n$ with the
properties (9). Its rank $r$ indicates that $A_0A_1 \cdots A_n$ is contained in $R_r$ but
not in $R_{r-1}$, hence $A_1A_2 \cdots A_n$ with the desired properties, i.e. $\widehat{A_iA_k} = a_{ik}$, is
contained in $S_{r-1}^\rho$ but not in $S_{r-2}^\rho$.
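
The reduction from spherical distances to chords can be illustrated numerically. The following sketch, with hypothetical function names, builds the coefficient matrix cos(a_ik/ρ) of the form in Theorem 2 and, separately, the matrix ½(ā_0i² + ā_0k² − ā_ik²) obtained from the chord lengths ā_0i = ρ and ā_ik = 2ρ sin(a_ik/2ρ); the two should differ exactly by the factor ρ². It also checks the inequalities a_ik ≤ πρ. It does not test positivity of the form.

```python
import math

def spherical_form_matrix(a, rho):
    """Coefficients cos(a_ik / rho) of the quadratic form Phi."""
    n = len(a)
    return [[math.cos(a[i][k] / rho) for k in range(n)] for i in range(n)]

def chord_form_matrix(a, rho):
    """Coefficients (1/2)(rho^2 + rho^2 - chord_ik^2) from the chord lengths."""
    n = len(a)
    matrix = []
    for i in range(n):
        row = []
        for k in range(n):
            chord = 2 * rho * math.sin(a[i][k] / (2 * rho))
            row.append(0.5 * (rho ** 2 + rho ** 2 - chord ** 2))
        matrix.append(row)
    return matrix

def distances_admissible(a, rho):
    """Inequalities (7): no spherical distance exceeds pi * rho."""
    return all(a[i][k] <= math.pi * rho for i in range(len(a)) for k in range(len(a)))

# Usage: three points at mutual spherical distance 1 on a sphere of radius 2
a = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
rho = 2.0
phi = spherical_form_matrix(a, rho)
gram = chord_form_matrix(a, rho)
```

Each entry of `gram` should equal ρ² times the corresponding entry of `phi`, which is the identity 1 − 2 sin²(t/2) = cos t used in the text.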


4. The set of quantities $a_{ik}$ in Theorem 2 could be thought of as the edges of an
abstractly defined (n − 1)-simplex (in Menger's terminology it is a semi-metric
space composed of n points). Theorem 2 answers the question whether or
not this abstract simplex can be immersed isometrically, i.e. by congruence, in a
spherical space of given radius.

An interesting consequence of Theorem 2 is the following theorem.

THEOREM 3. Let $\sigma_{n-1}$ be a (n − 1)-simplex of a $S_{n-1}^{\rho_0}$; there exists a radius
$\rho_1 \le \rho_0$ such that $\sigma_{n-1}$ can be immersed isometrically in $S_{n-2}^{\rho_1}$.

Thus for n = 3 we get the following geometrically obvious statement: Any
ordinary spherical triangle of a $S_2^{\rho_0}$ can be placed isometrically on a circumference
of suitable radius $\rho_1 \le \rho_0$.

We note first that if $\sigma_{n-1}$ can be immersed in $S_{n-2}^{\rho_0}$, which happens when the
rank of

$$(10)\qquad \Phi(x; \rho) = \sum_{i,k=1}^{n} \cos(a_{ik}/\rho)\,x_i x_k$$

is ≤ n − 1 for $\rho = \rho_0$, our theorem is proved with $\rho_1 = \rho_0$. Let us now assume
$\Phi(x; \rho_0)$ to be of rank n, hence

$$a_{ik} \le \pi\rho_0; \qquad \Phi(x; \rho_0) \text{ positive definite}$$

by Theorem 2. Note that $\Phi(x; \rho)$ can not be positive definite for all $\rho$ with
$0 < \rho \le \rho_0$, for it fails to be so if e.g. $\rho = a_{12}/\pi$ since the first principal minor of
order 2 of the discriminant of $\Phi(x; a_{12}/\pi)$ vanishes. Call $\rho_1$ the greatest lower
bound of the values $\sigma$ with the property that $\Phi(x; \rho)$ is positive definite if
$\sigma < \rho \le \rho_0$. By a previous remark necessarily

$$(11)\qquad a_{ik} \le \pi\rho_1.$$

Now $\Phi(x; \rho)$ can not be positive definite if $\rho = \rho_1$ for it would still be so (by
continuity) for all values $\rho$ sufficiently close to $\rho_1$ in contradiction to the definition
of $\rho_1$. But $\Phi(x; \rho_1)$ is necessarily positive, as the limit of positive definite forms
$\Phi(x; \rho)$, for $\rho \to \rho_1 + 0$. Hence $\Phi(x; \rho_1)$ is positive and of rank < n. Now the
proof is completed by (11) and Theorem 2.⁴


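5. We shall now extend Theorem 3 to cover the case when $\rho_0 = \infty$, that is
when $\sigma_{n-1}$ is in $R_{n-1}$. We assume $\sigma_{n-1}$, of edges $a_{ik}$, to be a (n − 1)-simplex of
$R_{n-1}$, i.e.

$$(12)\qquad \frac12\sum_{i,k=2}^{n} (a_{1i}^2 + a_{1k}^2 - a_{ik}^2)\,x_i x_k \quad\text{positive definite.}$$

Let us prove that $\sigma_{n-1}$ can be immersed isometrically in $S_{n-1}^\rho$, provided $\rho$ is
sufficiently large. This is proved if we can show that

$$\Phi(x; \rho) = \sum_{i,k=1}^{n} \cos(a_{ik}/\rho)\,x_i x_k$$

is positive definite if $\rho$ is sufficiently large. A well known criterion states that a
quadratic form is positive definite if and only if all the n principal minors of its
discriminant chosen as follows, namely those formed from the first $\nu$ rows and
columns ($\nu = 1, \cdots, n$), are positive (see Dickson [3], §40). If in the matrix of coefficients

$$\begin{pmatrix} 1 & \cos\dfrac{a_{1k}}{\rho} \\[2mm] \cos\dfrac{a_{i1}}{\rho} & \cos\dfrac{a_{ik}}{\rho} \end{pmatrix} \qquad (i, k = 2, \cdots, n)$$

of $\Phi(x; \rho)$ we subtract the first line from all the other lines and then the first
column from all the other columns we get the symmetric matrix

$$(13)\qquad \begin{pmatrix} 1 & \cos\dfrac{a_{1k}}{\rho} - 1 \\[2mm] \cos\dfrac{a_{i1}}{\rho} - 1 & \cos\dfrac{a_{ik}}{\rho} - \cos\dfrac{a_{i1}}{\rho} - \cos\dfrac{a_{1k}}{\rho} + 1 \end{pmatrix}$$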

4 Note that $\rho = \rho_1$ is the first value < $\rho_0$ which is a root of the transcendental equation
$\det\|\cos(a_{ik}/\rho)\| = 0$. It would be interesting to decide whether $\rho = \rho_1$ is necessarily a
simple root of this equation.







which, as a result of the above criterion, will be the matrix of a positive definite form if and only if $\Phi(x;\rho)$ is positive definite itself. Noting that (13) can be written as follows

$$\begin{pmatrix} 1 & -\dfrac{a_{1k}^2}{2\rho^2} + O\!\left(\dfrac{1}{\rho^4}\right) \\[3mm] -\dfrac{a_{1i}^2}{2\rho^2} + O\!\left(\dfrac{1}{\rho^4}\right) & \dfrac{1}{2\rho^2}\,(a_{1i}^2 + a_{1k}^2 - a_{ik}^2) + O\!\left(\dfrac{1}{\rho^4}\right) \end{pmatrix}$$

we see that the $\nu$-th ($\nu > 1$) principal minor of (13) is equal to $\rho^{-2(\nu-1)}$ times the $(\nu - 1)$-th principal minor of the discriminant of (12), plus a remainder $O(\rho^{-2\nu})$. By (12) all these minors are positive if $\rho$ is sufficiently large, hence $\Phi(x;\rho)$ is positive definite and $\sigma_{n-1}$ can be immersed in $S^{\rho}_{n-1}$. For any such $\rho = \rho_0$, Theorem 3 proves the existence of $S^{\rho_1}_{n-2}$, with $\rho_1 < \rho_0$, in which $\sigma_{n-1}$ can be immersed. We have thus proved the following

THEOREM 3′ (of Gödel). If $\sigma_n$ is an $n$-simplex of $R_n$, then there always exists an $S^{\rho}_{n-1}$ in which $\sigma_n$ can be immersed isometrically.⁵
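The criterion used in the proof can be tried numerically. The following is a minimal sketch, not part of the original argument: it builds the matrix $\cos(a_{ik}/\rho)$ for a hypothetical triangle in the plane (the sample vertices and the helper names `leading_minors`, `is_positive_definite`, `phi_matrix` are assumptions made for illustration) and tests the sign of the leading principal minors. It checks only the positive definiteness of $\Phi(x;\rho)$, which the text shows to hold for sufficiently large $\rho$.

```python
import math

def leading_minors(matrix):
    """Leading principal minors, obtained as running products of the pivots
    of Gaussian elimination without row exchanges."""
    a = [row[:] for row in matrix]
    n = len(a)
    minors, det = [], 1.0
    for k in range(n):
        det *= a[k][k]
        minors.append(det)
        if a[k][k] == 0:
            break  # zero pivot: the form cannot be positive definite
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return minors

def is_positive_definite(matrix):
    """Criterion quoted in the text: all n leading principal minors positive."""
    minors = leading_minors(matrix)
    return len(minors) == len(matrix) and all(d > 0 for d in minors)

def phi_matrix(points, rho):
    """Coefficients cos(a_ik / rho) of Phi(x; rho); a_ik are Euclidean distances."""
    return [[math.cos(math.dist(p, q) / rho) for q in points] for p in points]

# Hypothetical sample data: a non-degenerate triangle in the Euclidean plane.
TRIANGLE = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
```

For this triangle the form is positive definite at `rho = 10.0` but not at `rho = 0.3`, in line with the requirement that $\rho$ be sufficiently large.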


The case of indefinite spaces 


6. Consider the space of real variables $(y_1, \dots, y_m)$ with the property that the square of the distance $PP'$ of two points is given by the formula

$$\overline{PP'}^{\,2} = \sum_{\nu=1}^{m} \epsilon_\nu (y_\nu - y'_\nu)^2,$$

with $\epsilon_\nu = +1$ for $\nu = 1, \dots, p$, $\epsilon_\nu = -1$ for $\nu = p+1, \dots, p+q$ $(= m)$. We denote this space by $R_{p,q}$; thus $R_m = R_{m,0}$. The linear geometry of $R_{p,q}$ is obviously the same as that of $R_{p+q} = R_m$.

Let now $\frac{1}{2}n(n+1)$ real numbers $c_{ik}$ ($c_{ii} = 0$, $c_{ik} = c_{ki}$; $i, k = 0, \dots, n$) be given. Are there $n + 1$ points $A_0, A_1, \dots, A_n$ in some space $R_{p',q'}$ such that $\overline{A_iA_k}^{\,2} = c_{ik}$, and what is the space $R_{p,q}$ of the least number of dimensions in which there are such points? A complete answer is furnished by the following theorem.

THEOREM 1′. Consider the quadratic form

(14) $$F(x_1, x_2, \dots, x_n) = \frac{1}{2}\sum_{i,k=1}^{n} (c_{0i} + c_{0k} - c_{ik})\, x_i x_k.$$





⁵ A heuristic proof of this theorem for $n = 3$ is as follows. Think of the edges of $\sigma_3$ to be made of flexible strings; place in the interior of $\sigma_3$ a small sphere which is gradually inflated. This sphere will reach a certain definite size when it will become tightly packed within the 6 strings (edges) of $\sigma_3$. Note that in the rigorous proof above a very large sphere was used which was gradually deflated to its proper size.










Let it be of type $(p, q)$.⁶ The necessary and sufficient conditions that there be $n + 1$ points $A_0, A_1, \dots, A_n$ in $R_{p',q'}$ with $\overline{A_iA_k}^{\,2} = c_{ik}$, are the inequalities

$$p' \geq p, \qquad q' \geq q.$$

Thus $R_{p,q}$ is the least space in which there are such points.

The condition is necessary. Let the points $A_0 = 0, A_1, \dots, A_n$ in $R_{p',q'}$ have the required property and let $R_{p,q}$ be the least linear subspace containing these points. We know that $p \leq p'$, $q \leq q'$, $p + q \leq n$. Let $p + q = m$ and let $A_i = (a_{i1}, \dots, a_{im})$ be the coördinates of $A_i$ in $R_{p,q}$ with respect to an orthogonal coördinate system. For the point

$$P = x_1 A_1 + \cdots + x_n A_n = (\xi_1, \dots, \xi_m)$$

of coördinates $\xi_\nu = x_1 a_{1\nu} + \cdots + x_n a_{n\nu}$ we find as in section 2 the identity

$$\overline{OP}^{\,2} = \sum_{\nu=1}^{m} \epsilon_\nu \xi_\nu^2 = \sum_{\nu=1}^{m} \epsilon_\nu (x_1 a_{1\nu} + \cdots + x_n a_{n\nu})^2 = F(x_1, \dots, x_n).$$

Viewing our assumption that the matrix of the $a_{i\nu}$ is of rank $m$ and the law of inertia (Dickson, [3], p. 72), we see that $F(x)$ is of type $(p, q)$.

The condition is sufficient. Assume first $p + q = n$. By a non-singular transformation

(3′) $(y) = H(x)$

we get the identity

$$F(x_1, \dots, x_n) = y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_n^2.$$

Consider in the space $R_{p,q}$ of the variables $(y_1, \dots, y_n)$ the points whose $x$-coördinates are given by (5). We find as in section 2 $\overline{A_iA_k}^{\,2} = c_{ik}$ and the theorem is proved, for $R_{p,q}$ can be considered as a subspace of $R_{p',q'}$, if $p' \geq p$, $q' \geq q$.

If $p + q = m < n$, then we get

$$F(x_1, \dots, x_n) = y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_m^2.$$

To get the desired points we have to project the points $A_0, \dots, A_n$ on the manifold $y_{m+1} = \cdots = y_n = 0$, which is an $R_{p,q}$.
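As an illustration of Theorem 1′, the type $(p, q)$ of the form (14) can be computed from prescribed squared distances $c_{ik}$. The sketch below is only one possible way to do this and is not taken from the paper: it diagonalizes the symmetric matrix by exact congruence operations (the function names `gram_form` and `form_type` and the sample inputs are assumptions for illustration) and counts positive and negative pivots, which by the law of inertia gives the type.

```python
from fractions import Fraction

def gram_form(c):
    """Matrix b_ik = (c_0i + c_0k - c_ik) / 2 of the form (14).
    `c` is the symmetric (n+1) x (n+1) table of squared distances."""
    n = len(c) - 1
    return [[Fraction(c[0][i] + c[0][k] - c[i][k]) / 2
             for k in range(1, n + 1)] for i in range(1, n + 1)]

def form_type(b):
    """Type (p, q) of a real symmetric matrix, by symmetric elimination."""
    a = [[Fraction(x) for x in row] for row in b]
    n = len(a)
    p = q = 0
    for k in range(n):
        # Look for a nonzero diagonal entry in the remaining block.
        j = next((i for i in range(k, n) if a[i][i] != 0), None)
        if j is None:
            # All remaining diagonal entries vanish: use an off-diagonal entry.
            pair = next(((i, l) for i in range(k, n)
                         for l in range(i + 1, n) if a[i][l] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, l = pair
            for t in range(k, n):      # add row l to row i
                a[i][t] += a[l][t]
            for t in range(k, n):      # add column l to column i
                a[t][i] += a[t][l]
            j = i                      # now a[i][i] = 2 * (old a[i][l]) != 0
        # Bring index j to position k (rows and columns).
        a[k], a[j] = a[j], a[k]
        for row in a:
            row[k], row[j] = row[j], row[k]
        pivot = a[k][k]
        if pivot > 0:
            p += 1
        else:
            q += 1
        # Eliminate column k below the pivot; the remaining block stays symmetric.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for t in range(k, n):
                a[i][t] -= factor * a[k][t]
    return p, q
```

For example, the squared distances $c_{01} = 1$, $c_{02} = -1$, $c_{12} = 0$ give $F = x_1^2 - x_2^2$, of type $(1, 1)$, so by the theorem the least space containing such points is $R_{1,1}$.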


7. It should be remarked that $F$ defined by (14) is the most general real quadratic form in $n$ variables. We thus have the following

COROLLARY. Let

(15) $$F = \sum_{i,k=1}^{n} b_{ik}\, x_i x_k$$





⁶ That is, of index $p$ and rank $p + q$. See Dickson [3], p. 71.




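be a non-degenerate real quadratic form of type $(p, q)$. If by means of

(3″) $(y) = H(x)$

we have

(16) $$F = y_1^2 + \cdots + y_p^2 - y_{p+1}^2 - \cdots - y_n^2,$$

then the columns of the matrix

$$H = \begin{pmatrix} h_{11} & \cdots & h_{1n} \\ \vdots & & \vdots \\ h_{n1} & \cdots & h_{nn} \end{pmatrix}$$

are the $y$-coördinates in $R_{p,q}$ of $n$ points $A_1, \dots, A_n$, which together with $A_0 = (0)$ have the property $\overline{A_iA_k}^{\,2} = c_{ik}$, where

$$c_{0i} = b_{ii}, \qquad c_{ik} = b_{ii} + b_{kk} - 2b_{ik} \qquad (i, k > 0).$$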





A geometric interpretation of the reduction of (15) to the canonical form (16) 
by means of an orthogonal linear transformation is well known from the theory of 
quadrics. The above Corollary furnishes a geometric interpretation of this 
reduction by any linear non-singular transformation. 

Probably the most concise description of the result of Theorems 1 and 1′ is as follows. If the squares of the edges of a simplex $A_0A_1 \cdots A_n$ are given real numbers, $\overline{A_iA_k}^{\,2} = c_{ik}$, then this defines uniquely a (indefinite) space which, if referred to the coördinate unit-vectors $A_0A_1, A_0A_2, \dots, A_0A_n$, has the line element

$$ds^2 = \frac{1}{2}\sum_{i,k=1}^{n} (c_{0i} + c_{0k} - c_{ik})\, x_i x_k.$$
SWARTHMORE COLLEGE, SWARTHMORE, Pa. 


REFERENCES 


[1] L. M. Blumenthal and G. A. Garrett: Characterization of spherical and pseudo-spherical sets of points, American J. of Math., vol. 53 (1931), pp. 619-640.

[2] H. S. M. Coxeter and J. A. Todd: On points with arbitrarily assigned mutual distances, Proc. of the Cambridge Phil. Soc., vol. 30 (1934), pp. 1-3.

[3] L. E. Dickson: Modern algebraic theories, Sanborn Co., Chicago, 1926.

[4] K. Gödel: Über die metrische Einbettbarkeit der Quadrupel des $R_3$ in Kugelflächen, Ergebnisse eines math. Kolloquiums, Leipzig und Berlin, Heft 4 (1933), pp. 16-17.

[5] L. Klanfer: Metrische Charakterisierung der Kugel (Vienna Dissertation), Ergebnisse eines math. Koll., Heft 4 (1933), pp. 43-45.

[6] K. Menger: Bericht über metrische Geometrie, Jahresber. der deutschen Math.-Ver., vol. 40 (1931), pp. 201-219.

[7] K. Menger: New foundations of euclidean geometry, American J. of Math., vol. 53 (1931), pp. 721-745.

[8] A. Wald: Komplexe und indefinite Räume, Ergebnisse eines math. Koll., Heft 5 (1933), pp. 32-42.





ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


A REVISION OF THE ALGEBRA OF LUCAS FUNCTIONS 
By E. T. Bell
(Received January 26, 1935) 


1. The functions U, V. Let $\alpha$, $\beta$, $z$ be independent complex variables. Write

$$b = \alpha + \beta, \qquad c = \alpha\beta, \qquad \Delta = \Delta(b, c) = 4c - b^2.$$

Unless otherwise indicated, principal values of irrationalities are to be understood. The functions $U$, $V$ are defined by

$$U = U(b, c, z) = \frac{\alpha^z - \beta^z}{\alpha - \beta}, \qquad V = V(b, c, z) = \alpha^z + \beta^z.$$

Lucas¹ defined these functions for $b$, $c$ rational and $z$ integral. The above definitions lead to a considerable simplification of the algebraic part of Lucas' theory and supply many new identities between the functions of the kind from which Lucas deduced important arithmetical theorems concerning integers forming recurring series of the second order, for example, what he called the law of repetition of primes in such series. The case $b$, $c$ rational and $z$ integral is of course included in the general. The isomorphism in §2 is also a first step toward constructing a theory similar to Lucas' for the polynomials in sn $x$, cn $x$ occurring in the theory of real multiplication of elliptic functions.²

We may take

$$2\alpha = b + i\Delta^{1/2}, \qquad 2\beta = b - i\Delta^{1/2}, \qquad \alpha - \beta = i\Delta^{1/2},$$

since changing the sign of $i$ interchanges $\alpha$, $\beta$ but does not affect $U$, $V$. Understanding the variables $b$, $c$, we shall write

$$U(b, c, z) = U(z), \qquad V(b, c, z) = V(z)$$

when convenient. From the definitions we have

$$U(-z) = -c^{-z}U(z), \qquad V(-z) = c^{-z}V(z);$$
$$W(z + 2) - bW(z + 1) + cW(z) = 0,$$





1 E. Lucas, American Journal of Mathematics, vol. 1 (1878), pp. 184-240, 289-321. Many 
expansions and identities for these functions appeared in the writings of Cauchy and others 
prior to Lucas. 

* The means by which this can be done will be discussed elsewhere: an isomorphism 
L—C—S-— E, where L, C, S, E denote the theories of the generalized Lucas functions of 
the present paper, the circular functions, the functions of angles occurring in spherical 
trigonometry, and elliptic (or elliptic theta) functions, provides the necessary apparatus. 
This is probably not what Lucas had in mind in the first paragraph of his memoir, as it can 
not be extended to abelian functions. 



where $W$ denotes either $U$ or $V$;

$$U(0) = 0, \quad U(1) = 1; \qquad V(0) = 2, \quad V(1) = b.$$
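The definitions of §1 can be tried out numerically. The following is a minimal sketch under stated assumptions: the sample roots `ALPHA`, `BETA` and the helper names are hypothetical, and the roots are chosen with small arguments so that the principal values satisfy $c^z = \alpha^z\beta^z$, which the reflection formulas for $U(-z)$, $V(-z)$ rely on.

```python
def lucas_uv(alpha, beta, z):
    """Return (U(z), V(z)) with U = (alpha^z - beta^z)/(alpha - beta),
    V = alpha^z + beta^z; Python's ** gives principal values."""
    az, bz = alpha ** z, beta ** z
    return (az - bz) / (alpha - beta), az + bz

# Hypothetical sample roots; Arg(alpha) + Arg(beta) stays well inside (-pi, pi].
ALPHA, BETA = complex(1.2, 0.5), complex(0.8, -0.3)
B, C = ALPHA + BETA, ALPHA * BETA

def recurrence_residual(z):
    """Largest |W(z+2) - b W(z+1) + c W(z)| over W = U and W = V."""
    w0, w1, w2 = (lucas_uv(ALPHA, BETA, z + k) for k in range(3))
    return max(abs(w2[i] - B * w1[i] + C * w0[i]) for i in range(2))

def reflection_residual(z):
    """Largest of |U(-z) + c^(-z) U(z)| and |V(-z) - c^(-z) V(z)|."""
    u, v = lucas_uv(ALPHA, BETA, z)
    um, vm = lucas_uv(ALPHA, BETA, -z)
    return max(abs(um + C ** (-z) * u), abs(vm - C ** (-z) * v))
```

Both residuals should vanish up to rounding for any complex `z`, not only for integers, which is the point of extending Lucas' definitions to a complex argument.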


2. An isomorphism. In a paper to appear in the American Journal, I have established the following isomorphism between $U$, $V$ and sin, cos. It is much easier to apply than the isomorphism of Lucas (loc. cit. p. 189), and it differs in principle from his (there is not, at any stage, equality, for any argument, between $U(z)$, $V(z)$ and the corresponding circular functions, but merely isomorphism).

Let $n_1, \dots, n_s$ be integers, $x_1, \dots, x_s$ independent variables, real or complex, and $z_1, \dots, z_s$ independent complex variables. The isomorphism is summed up in the following table, in which $\sim$ is the sign of 1, 1 correspondence.

$$x \sim z, \qquad n_j x_j \sim n_j z_j \quad (j = 1, \dots, s);$$
$$x = x_1 + \cdots + x_s, \qquad z = z_1 + \cdots + z_s,$$
$$\Delta = \Delta(b, c), \qquad U(z) = U(b, c, z), \qquad V(z) = V(b, c, z);$$
$$\sin x \sim \tfrac{1}{2} c^{-z/2} \Delta^{1/2}\, U(z), \qquad \cos x \sim \tfrac{1}{2} c^{-z/2}\, V(z),$$
$$U(z) \sim 2c^{z/2}\Delta^{-1/2} \sin x, \qquad V(z) \sim 2c^{z/2} \cos x;$$
$$\xi = n_1 x_1 + \cdots + n_s x_s, \qquad \zeta = n_1 z_1 + \cdots + n_s z_s;$$
$$\sin \xi \sim \tfrac{1}{2} c^{-\zeta/2} \Delta^{1/2}\, U(\zeta), \qquad \cos \xi \sim \tfrac{1}{2} c^{-\zeta/2}\, V(\zeta),$$
$$U(\zeta) \sim 2c^{\zeta/2}\Delta^{-1/2} \sin \xi, \qquad V(\zeta) \sim 2c^{\zeta/2} \cos \xi.$$


This contains several redundancies, but it has been stated in the above form for clearness. It is applied as follows. Let $\epsilon_{rj}$ denote a definite one of 0, 1, and for a particular $j$, let $x_{1j}, \dots, x_{sj}$ be independent real or complex variables, and $z_{1j}, \dots, z_{sj}$ independent complex variables. Write

$$x_j = \sum_{r=1}^{s} \epsilon_{rj} x_{rj}, \qquad z_j = \sum_{r=1}^{s} \epsilon_{rj} z_{rj} \qquad (j = 1, \dots, p + q).$$

Note that the $x_{rj}$, $x_{rk}$ in $x_j$, $x_k$ respectively are not necessarily all distinct when $j \neq k$, and similarly for $z_j$, $z_k$. Write

$$X(z_j) = \tfrac{1}{2} c^{-z_j/2} \Delta^{1/2}\, U(z_j), \qquad Y(z_j) = \tfrac{1}{2} c^{-z_j/2}\, V(z_j),$$
$$S(z_j, x_j) = 2c^{z_j/2}\Delta^{-1/2} \sin x_j, \qquad C(z_j, x_j) = 2c^{z_j/2} \cos x_j,$$
$$U(z_j) = U(b, c, z_j), \qquad V(z_j) = V(b, c, z_j), \qquad \Delta = \Delta(b, c).$$

Then the isomorphism states that a finite relation of the form

$$I(\sin x_1, \dots, \sin x_p, \cos x_{p+1}, \dots, \cos x_{p+q}) = 0$$

implies

$$I(X(z_1), \dots, X(z_p), Y(z_{p+1}), \dots, Y(z_{p+q})) = 0;$$










and conversely, a finite relation of the form

$$I(U(z_1), \dots, U(z_p), V(z_{p+1}), \dots, V(z_{p+q})) = 0$$

implies

$$I(S(z_1, x_1), \dots, S(z_p, x_p), C(z_{p+1}, x_{p+1}), \dots, C(z_{p+q}, x_{p+q})) = 0.$$

In the converse it is evident that when $I = 0$ is reduced, $b$, $c$ must disappear, by cancellation or otherwise, from the trigonometric $I$, since the $x_{rj}$, $z_{rj}$ are not connected. We shall refer to the $I$ with $U$, $V$ as a Lucas $I$.

The second part of the isomorphism, concerning $\xi$, $\zeta$, is applied in the same way, by substituting $n_{rj}x_{rj}$, $n_{rj}z_{rj}$ for $x_{rj}$, $z_{rj}$ ($r = 1, \dots, s$), where the $n_{rj}$ are integers. In discussing the consequences of the isomorphism we may confine our attention to the first part, since the second is obtained from the first by the transformation

$$x \to \xi, \qquad z \to \zeta, \qquad x_{rj} \to n_{rj}x_{rj}, \qquad z_{rj} \to n_{rj}z_{rj}.$$

One or two examples will suffice. Let $x$, $z$ be as in the table. Then, from the isomorphism applied to

$$\sin^2 x + \cos^2 x = 1,$$

we get

$$\Delta\, U^2(z) + V^2(z) = 4c^z.$$

Let $z_1$, $z_2$ be as above. Corresponding to the addition theorems for sin, cos we get at once from the isomorphism

$$2U(z_1 + z_2) = U(z_1)V(z_2) + V(z_1)U(z_2),$$
$$2V(z_1 + z_2) = V(z_1)V(z_2) - \Delta\, U(z_1)U(z_2).$$

It will be seen in §4 that either of these implies the other immediately.
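The two examples can be checked numerically by forming the Lucas counterparts $X(z) = \tfrac12 c^{-z/2}\Delta^{1/2}U(z)$ and $Y(z) = \tfrac12 c^{-z/2}V(z)$ of sin and cos and testing that they satisfy the Pythagorean identity and the addition theorems. This is only an illustrative sketch: the sample roots are hypothetical, and they are chosen so that principal values satisfy $c^z = \alpha^z\beta^z$.

```python
import cmath

# Hypothetical sample roots (small arguments, so that c^z = alpha^z * beta^z).
ALPHA, BETA = complex(1.2, 0.5), complex(0.8, -0.3)
B, C = ALPHA + BETA, ALPHA * BETA
DELTA = 4 * C - B * B

def U(z):
    return (ALPHA ** z - BETA ** z) / (ALPHA - BETA)

def V(z):
    return ALPHA ** z + BETA ** z

def X(z):
    """Lucas counterpart of sin x in the isomorphism."""
    return 0.5 * C ** (-z / 2) * cmath.sqrt(DELTA) * U(z)

def Y(z):
    """Lucas counterpart of cos x in the isomorphism."""
    return 0.5 * C ** (-z / 2) * V(z)

def residuals(z1, z2):
    """Residuals of the relations obtained from sin^2 + cos^2 = 1 and the addition theorems."""
    return [
        abs(X(z1) ** 2 + Y(z1) ** 2 - 1),
        abs(X(z1 + z2) - (X(z1) * Y(z2) + Y(z1) * X(z2))),
        abs(Y(z1 + z2) - (Y(z1) * Y(z2) - X(z1) * X(z2))),
        abs(DELTA * U(z1) ** 2 + V(z1) ** 2 - 4 * C ** z1),
        abs(2 * U(z1 + z2) - (U(z1) * V(z2) + V(z1) * U(z2))),
        abs(2 * V(z1 + z2) - (V(z1) * V(z2) - DELTA * U(z1) * U(z2))),
    ]
```

The first three residuals test the trigonometric form, the last three the corresponding relations in $U$, $V$; the sign chosen for $\Delta^{1/2}$ does not matter, since each relation is either even or homogeneous of degree one in it.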
3. Invariance. Let $w$ be a complex variable. Under the transformation $\alpha \to \alpha^w$, $\beta \to \beta^w$, we have, from the definitions in §1,

$$b \to V(w), \qquad c \to c^w, \qquad \Delta \to \Delta\, U^2(w),$$
$$U(z) \to U(zw)/U(w), \qquad V(z) \to V(zw),$$

with $z = z_1 + \cdots + z_s$. Hence

$$\tfrac{1}{2} c^{-z/2} \Delta^{1/2}\, U(z) \to \tfrac{1}{2} c^{-zw/2} \Delta^{1/2}\, U(zw),$$
$$\tfrac{1}{2} c^{-z/2}\, V(z) \to \tfrac{1}{2} c^{-zw/2}\, V(zw).$$

The transformed members in the last could have been obtained from the originals by $z \to zw$ or $z_j \to wz_j$ ($j = 1, \dots, s$). The $z_j$ are independent complex variables. Hence the $z_j' = wz_j$ are independent complex variables, and the transformation $\alpha \to \alpha^w$, $\beta \to \beta^w$ accomplishes merely a change of notation from variables $z_j$ to variables $z_j'$ in any Lucas $I$ relation (as in §2). Hence such relations are invariant under the transformation. Since the second part of the isomorphism is implied by the first, there is a like invariance for functions $U(\zeta)$, $V(\zeta)$.

In Lucas’ theory the corresponding transformation (loc. cit. p. 189) alters 
U, V relations and generalizes them. Here the relations are obtained at once 
in their general, invariant form. 
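The effect of $\alpha \to \alpha^w$, $\beta \to \beta^w$ on $b$, $c$, $\Delta$, $U$, $V$ can be confirmed on sample values. In the sketch below the roots are hypothetical positive reals, an assumption made so that $(\alpha^w)^z = \alpha^{wz}$ holds for the principal values at the small $w$ used; the helper names are likewise illustrative.

```python
# Hypothetical positive real roots, so that (alpha^w)^z = alpha^(wz) for small w.
ALPHA, BETA = 1.3, 0.7

def uv(alpha, beta, z):
    """(U(z), V(z)) for the given pair of roots."""
    return (alpha ** z - beta ** z) / (alpha - beta), alpha ** z + beta ** z

def invariance_residuals(w, z):
    """Residuals of b -> V(w), c -> c^w, Delta -> Delta U^2(w),
    U(z) -> U(zw)/U(w), V(z) -> V(zw)."""
    b, c = ALPHA + BETA, ALPHA * BETA
    delta = 4 * c - b * b
    a2, b2 = ALPHA ** w, BETA ** w            # transformed roots
    u_w, v_w = uv(ALPHA, BETA, w)
    u_zw, v_zw = uv(ALPHA, BETA, z * w)
    u_new, v_new = uv(a2, b2, z)              # U, V built on the transformed roots
    return [
        abs((a2 + b2) - v_w),
        abs(a2 * b2 - c ** w),
        abs(4 * a2 * b2 - (a2 + b2) ** 2 - delta * u_w ** 2),
        abs(u_new - u_zw / u_w),
        abs(v_new - v_zw),
    ]
```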


4. Transformation. From any Lucas $I$ relation (as in §2) many more can be written down immediately by transformations of the functions $U$, $V$. The method of finding transformations changing a given Lucas $I$ relation into another is in all cases the same: the trigonometric correspondent of the given relation is transformed, by transforming the sines and cosines, and this induces a transformation on the $U$, $V$, by means of the isomorphism. The transformed trigonometric correspondent is implied by the original. Hence, if in the original Lucas $I$ relation the transformed $U$, $V$ be substituted for $U$, $V$, the relation thus obtained is implied by the original $U$, $V$ relation. Instead of transforming the variables in the trigonometric correspondent, we may operate on the sines and cosines, by differentiation, etc., and proceed as before, substituting directly from the transformed isomorphism into the Lucas $I$ relation the induced transforms of the $U$, $V$.

Let us examine first the correspondent for Lucas $I$ relations of the periodicity of the circular functions. The notation is as in the isomorphism,

$$x = x_1 + \cdots + x_s, \qquad z = z_1 + \cdots + z_s.$$

Let

$$x_j \to \pi/2 - x_j, \qquad x_k \to -x_k \quad (k \neq j).$$

Then $x \to \pi/2 - x$, and

$$\sin x \to \cos x, \qquad \cos x \to \sin x,$$
$$\sin x_j \to \cos x_j, \qquad \cos x_j \to \sin x_j,$$
$$\sin x_k \to -\sin x_k, \qquad \cos x_k \to \cos x_k.$$

Applying these to the isomorphism we have, for example,

$$\tfrac{1}{2} c^{-z/2}\, V(z) \sim \cos x \to \sin x \sim \tfrac{1}{2} c^{-z/2} \Delta^{1/2}\, U(z),$$

and similarly for the rest. Hence from any $U$, $V$ relation in $z$ we can write down another according to the following transformation,

$$U(z) \to \Delta^{-1/2}\, V(z), \qquad V(z) \to \Delta^{1/2}\, U(z),$$
$$U(z_j) \to \Delta^{-1/2}\, V(z_j), \qquad V(z_j) \to \Delta^{1/2}\, U(z_j),$$
$$U(z_k) \to -U(z_k), \qquad V(z_k) \to V(z_k) \quad (k \neq j).$$










As an example of the application of this transformation, it is verified by inspec- 
tion that either of the addition theorems in §2 is transformed into the other. 


Since

$$U(-z) = -c^{-z}U(z), \qquad V(-z) = c^{-z}V(z),$$

it follows that the four addition and subtraction theorems for $U$, $V$ are implied by any one of them, precisely as in trigonometry from the periodicity and parity of the sine and cosine.

In the same way we obtain the following transformation, which is useful in the summation of series of functions $U$, $V$. Let $w$ be a complex variable and $n$ an integer. Then from any $U$, $V$ relation we can obtain another by the transformation

$$U(2nw) \to (-1)^{n-1}\, U(2nw),$$
$$V(2nw) \to (-1)^{n}\, V(2nw),$$
$$U((2n-1)w) \to (-1)^{n-1} \Delta^{-1/2}\, V((2n-1)w),$$
$$V((2n-1)w) \to (-1)^{n-1} \Delta^{1/2}\, U((2n-1)w),$$

which we shall call the parity transformation. The generalization to functions $U(\zeta)$, $V(\zeta)$, where $\zeta$ is as in §2, is obvious.

With $x$, $z$ as before, $x \to -x$ induces

$$U(z) \to -U(z), \qquad V(z) \to V(z);$$

and ($n$ is an integer) $x \to \pi - x$ induces

$$U(nz) \to (-1)^{n-1}\, U(nz), \qquad V(nz) \to (-1)^{n}\, V(nz).$$
Consider a Lucas $I$ relation in which $I$ is a sum of terms of the form

$$T(z) = k \prod_{i=1}^{p} (U(m_i z))^{r_i} \prod_{j=1}^{q} (V(n_j z))^{s_j},$$

where $k$ is independent of $z$, and the $m_i$, $r_i$, $n_j$, $s_j$ are integers. Write

$$M(z) = \Delta^{-1/2} \sum_{i=1}^{p} r_i m_i \frac{V(m_i z)}{U(m_i z)} - \Delta^{1/2} \sum_{j=1}^{q} s_j n_j \frac{U(n_j z)}{V(n_j z)}.$$

Then in the $I$ relation we may apply the transformation

$$T(z) \to T(z)\, M(z)$$

to the term $T(z)$, and a similar transformation to each term, to get a new Lucas $I$ relation. This follows in the same way as before, from the transformation induced by differentiation with respect to $x$ of the trigonometric relation corresponding to $I$. The transformation is applicable if the $r$, $s$ are rational numbers. Integration with respect to $x$ induces a transformation which is the inverse of the above.










The transformation $T(z) \to T(z)\, M(z)$ will be called derivation, and the resulting relation the derivate of the original. The special case $p = q = 1$ is particularly useful:

$$V^r(nz)\, U^s(mz) \to -rn\, \Delta^{1/2}\, V^{r-1}(nz)\, U(nz)\, U^s(mz) + sm\, \Delta^{-1/2}\, V^r(nz)\, U^{s-1}(mz)\, V(mz).$$

Hence, taking $r = 0$, $s = 0$ in turn, we have the useful transformation

$$U^s(mz) \to sm\, \Delta^{-1/2}\, U^{s-1}(mz)\, V(mz),$$
$$V^r(nz) \to -rn\, \Delta^{1/2}\, V^{r-1}(nz)\, U(nz).$$


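In a properly restricted region $U$, $V$ are analytic functions of $b$, $c$. Hence differentiation with respect to $b$, $c$ furnishes further relations. The necessary formulas are easily calculated:

$$\Delta \frac{\partial U(z)}{\partial b} = bU(z) - zV(z), \qquad \frac{\partial \Delta}{\partial b} = -2b,$$
$$\Delta \frac{\partial U(z)}{\partial c} = zV(z - 1) - 2U(z), \qquad \frac{\partial \Delta}{\partial c} = 4,$$
$$\frac{\partial V(z)}{\partial b} = zU(z), \qquad \frac{\partial V(z)}{\partial c} = -zU(z - 1).$$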
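The four derivative formulas for $U$, $V$ with respect to $b$, $c$ can be compared against central finite differences. This is a rough numerical sketch, not a proof: the step size, the sample point and the helper names are assumptions, and $U$, $V$ are rebuilt from $b$, $c$ through $2\alpha = b + i\Delta^{1/2}$, $2\beta = b - i\Delta^{1/2}$.

```python
import cmath

def lucas(b, c, z):
    """Return (U, V, Delta) as functions of b, c, z."""
    delta = 4 * c - b * b
    root = 1j * cmath.sqrt(delta)
    alpha, beta = (b + root) / 2, (b - root) / 2
    u = (alpha ** z - beta ** z) / (alpha - beta)
    v = alpha ** z + beta ** z
    return u, v, delta

def central(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def derivative_residuals(b, c, z):
    """Residuals of the four formulas for dU/db, dU/dc, dV/db, dV/dc."""
    u, v, delta = lucas(b, c, z)
    u1, v1, _ = lucas(b, c, z - 1)            # U(z - 1), V(z - 1)
    du_db = central(lambda t: lucas(t, c, z)[0], b)
    du_dc = central(lambda t: lucas(b, t, z)[0], c)
    dv_db = central(lambda t: lucas(t, c, z)[1], b)
    dv_dc = central(lambda t: lucas(b, t, z)[1], c)
    return [
        abs(delta * du_db - (b * u - z * v)),
        abs(delta * du_dc - (z * v1 - 2 * u)),
        abs(dv_db - z * u),
        abs(dv_dc + z * u1),
    ]
```

A quick sanity check by hand: for $z = 2$ one has $V(2) = b^2 - 2c$, so $\partial V/\partial c = -2 = -2U(1)$, in agreement with the last formula.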


5. Multiplication and division. Let $z$ be a complex variable, $n$ an integer $> 0$. The problem of multiplication is to express $U(nz)$, $V(nz)$ as functions of $U(z)$, $V(z)$. It is solved by inspection on applying the isomorphism to the formulas for $\sin nx$, $\cos nx$. Thus from the pair of formulas, obtained from Demoivre's theorem,

$$\sin nx = \sum_{s=0}^{[(n-1)/2]} (-1)^s \binom{n}{2s+1} \cos^{n-2s-1} x\, \sin^{2s+1} x,$$
$$\cos nx = \sum_{s=0}^{[n/2]} (-1)^s \binom{n}{2s} \cos^{n-2s} x\, \sin^{2s} x,$$

in which $[y]$ denotes the greatest integer in $y$, we have

$$2^{n-1}\, U(nz) = \sum_{s=0}^{[(n-1)/2]} (-1)^s \binom{n}{2s+1} \Delta^s\, U^{2s+1}(z)\, V^{n-2s-1}(z),$$
$$2^{n-1}\, V(nz) = \sum_{s=0}^{[n/2]} (-1)^s \binom{n}{2s} \Delta^s\, U^{2s}(z)\, V^{n-2s}(z).$$
s=0 


The parity transformation merely interchanges these two, as it must from the effect of $x \to \pi/2 - x$ on Demoivre's theorem. To express $V(nz)$ as a polynomial in $V(z)$ alone we apply the isomorphism to

$$2\cos nx = \sum_{s=0}^{[n/2]} (-1)^s \frac{n}{n-s} \binom{n-s}{s} (2\cos x)^{n-2s},$$

which gives

$$V(nz) = \sum_{s=0}^{[n/2]} (-1)^s \frac{n}{n-s} \binom{n-s}{s} c^{sz}\, V^{n-2s}(z);$$


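Derivation (§4) of this gives

$$U(nz) = U(z) \sum_{s=0}^{[(n-1)/2]} (-1)^s \binom{n-s-1}{s} c^{sz}\, V^{n-2s-1}(z).$$

From each of the last we find two more by the parity transformation:

$$(-1)^{n-1}\, U((2n-1)z) = \sum_{s=0}^{n-1} (-1)^s \frac{2n-1}{2n-s-1} \binom{2n-s-1}{s} c^{sz} \Delta^{n-s-1}\, U^{2n-2s-1}(z),$$

$$(-1)^{n-1}\, U(2nz) = V(z) \sum_{s=0}^{n-1} (-1)^s \binom{2n-s-1}{s} c^{sz} \Delta^{n-s-1}\, U^{2n-2s-1}(z),$$

$$(-1)^{n-1}\, V((2n-1)z) = V(z) \sum_{s=0}^{n-1} (-1)^s \binom{2n-s-2}{s} c^{sz} \Delta^{n-s-1}\, U^{2n-2s-2}(z).$$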


These can easily be rewritten in ascending powers of U, V if desired. 
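The multiplication formulas lend themselves to a direct numerical check. The sketch below evaluates, for hypothetical sample roots, the two sums obtained from Demoivre's theorem and the expressions of $V(nz)$ and $U(nz)$ in powers of $V(z)$, with coefficients $\frac{n}{n-s}\binom{n-s}{s}$ and $\binom{n-s-1}{s}$; the function names are illustrative, and $c^z$ is computed as $\alpha^z\beta^z$ to stay clear of branch questions.

```python
from math import comb

# Hypothetical sample roots.
ALPHA, BETA = complex(1.2, 0.5), complex(0.8, -0.3)
DELTA = -(ALPHA - BETA) ** 2          # equals 4c - b^2

def U(z):
    return (ALPHA ** z - BETA ** z) / (ALPHA - BETA)

def V(z):
    return ALPHA ** z + BETA ** z

def demoivre_u(n, z):
    """U(nz) from 2^(n-1) U(nz) = sum (-1)^s C(n,2s+1) Delta^s U^(2s+1) V^(n-2s-1)."""
    u, v = U(z), V(z)
    total = sum((-1) ** s * comb(n, 2 * s + 1) * DELTA ** s
                * u ** (2 * s + 1) * v ** (n - 2 * s - 1)
                for s in range((n - 1) // 2 + 1))
    return total / 2 ** (n - 1)

def demoivre_v(n, z):
    """V(nz) from 2^(n-1) V(nz) = sum (-1)^s C(n,2s) Delta^s U^(2s) V^(n-2s)."""
    u, v = U(z), V(z)
    total = sum((-1) ** s * comb(n, 2 * s) * DELTA ** s
                * u ** (2 * s) * v ** (n - 2 * s)
                for s in range(n // 2 + 1))
    return total / 2 ** (n - 1)

def v_polynomial(n, z):
    """V(nz) as a polynomial in V(z) alone."""
    cz, v = ALPHA ** z * BETA ** z, V(z)
    return sum((-1) ** s * (n * comb(n - s, s) // (n - s)) * cz ** s * v ** (n - 2 * s)
               for s in range(n // 2 + 1))

def u_polynomial(n, z):
    """U(nz) as U(z) times a polynomial in V(z)."""
    cz, v = ALPHA ** z * BETA ** z, V(z)
    return U(z) * sum((-1) ** s * comb(n - s - 1, s) * cz ** s * v ** (n - 2 * s - 1)
                      for s in range((n - 1) // 2 + 1))
```

For $n = 2$ the last two functions reduce to $V(2z) = V^2(z) - 2c^z$ and $U(2z) = U(z)V(z)$.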

The problem of division is to express $U(z/n)$, $V(z/n)$, where $n$ (without loss of generality) is an integer greater than zero, in terms of $U(z)$, $V(z)$. If in the preceding formulas for $U(nz)$, $V(nz)$ we transform by $z \to z/n$, we get equations of degree $n$ to solve for $V(z/n)$ in terms of $V(z)$, $U(z)$ respectively. Similarly, by $z \to 2z/n$, equations of degrees $2n$, $2n - 1$ are obtained for expressing $U(z/n)$ in terms of $V(2z)$ and of $U(2z)$, $V(z)$ respectively; by $z \to z/(2n - 1)$, equations of degrees $2n - 1$, $2n - 2$ express $U(z/(2n - 1))$ in terms of $U(z)$ and of $U(z)$, $V(z)$ respectively. Finally, we may use

$$U(2z) = U(z)V(z), \qquad V(2z) = 2c^z - \Delta\, U^2(z) = V^2(z) - 2c^z.$$

Lucas alluded to the problem of division but did not discuss it.³ For $n = 2$ we get

$$U(z/2) = \Delta^{-1/2}\, [2c^{z/2} - V(z)]^{1/2}, \qquad V(z/2) = [2c^{z/2} + V(z)]^{1/2}.$$





* The summation formulas, loc. cit., p. 202, also the formulas (52), have no meaning as 
they involve Lucas’ functions U, V with fractional indices, and these are undefined in his 
theory. A correct interpretation can be given these formulas by using the results for 
division by 2 given here. 



















It may be pointed out that Lucas’ equality form of the isomorphism (loc. cit. 
p. 189), combined with the present formulas for division, gives a complete 
explicit solution of the problem of division, since the corresponding trigonometric 
problem is completely solvable. The functions U(z), V(z) appearing in this 
(even for b, c, rational and z integral) are not included in Lucas’ definitions of 


U, V. 


6. Summations. From the sum formulas for U, V whose arguments are 
in arithmetical progression, or from sums of like powers of such U, V, arith- 
metical theorems of some interest can be read off by inspection. We shall give a 
very general sum formula, including as special cases many of Lucas’. The 
possibilities being unlimited, there is no point in trying to be exhaustive. The 
trigonometric formula giving the sum by the isomorphism is not one of the 
standard formulas in the texts, but may be proved by a short calculation on 
using the exponential forms of the trigonometric functions. For r = 1 it checks 
with a known result. We have, for r a positive integer, x a complex variable, 


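$$2^r \sum_{s=1}^{n} x^s \cos^r (\alpha + (s-1)\beta) = \tfrac{1}{2}\{1 + (-1)^r\} \binom{r}{r/2} \frac{x(1 - x^n)}{1 - x} + 2x \sum_{j=0}^{[(r-1)/2]} \binom{r}{j} \frac{N_j}{D_j},$$

$$N_j = \cos (r - 2j)\alpha - x \cos \{(r - 2j)(\alpha - \beta)\} - x^n \cos \{(r - 2j)(\alpha + n\beta)\} + x^{n+1} \cos \{(r - 2j)(\alpha + (n - 1)\beta)\},$$

$$D_j = 1 - 2x \cos (r - 2j)\beta + x^2.$$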
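Since the trigonometric sum formula is not a textbook one, a numerical comparison of both sides is a useful safeguard. The sketch below assumes the sum is weighted by $x^s$ for $s = 1, \dots, n$ (so that a factor $x$ appears on the right), restricts itself to real $x \neq 1$ and real angles, and uses illustrative function names; it rests on the expansion of $\cos^r$ in cosines of multiple angles together with the finite geometric series.

```python
import math

def power_sum_lhs(r, n, x, a, b):
    """2^r * sum_{s=1..n} x^s cos^r(a + (s-1) b)."""
    return 2 ** r * sum(x ** s * math.cos(a + (s - 1) * b) ** r
                        for s in range(1, n + 1))

def power_sum_rhs(r, n, x, a, b):
    """Closed form built from the terms N_j / D_j (real x != 1 assumed)."""
    total = 0.0
    if r % 2 == 0:
        # (1/2){1 + (-1)^r} C(r, r/2) x (1 - x^n) / (1 - x), present only for even r
        total += math.comb(r, r // 2) * x * (1 - x ** n) / (1 - x)
    for j in range((r - 1) // 2 + 1):
        k = r - 2 * j
        n_j = (math.cos(k * a) - x * math.cos(k * (a - b))
               - x ** n * math.cos(k * (a + n * b))
               + x ** (n + 1) * math.cos(k * (a + (n - 1) * b)))
        d_j = 1 - 2 * x * math.cos(k * b) + x * x
        total += 2 * x * math.comb(r, j) * n_j / d_j
    return total
```

For $r = 1$, $n = 1$ both sides reduce to $2x\cos\alpha$, since then $N_0 = D_0 \cos\alpha$.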


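Let $z$, $w$, $u$ be independent complex variables. In the isomorphism let $\alpha \sim z$, $\beta \sim w$, and in the isomorph let $x \to c^{rw/2}u$. Then

$$\sum_{s=1}^{n} u^s\, V^r(z + (s - 1)w) = \tfrac{1}{2}\{1 + (-1)^r\} \binom{r}{r/2} c^{rz/2}\, \frac{u\,(1 - u^n c^{nrw/2})}{1 - uc^{rw/2}} + u \sum_{j=0}^{[(r-1)/2]} \binom{r}{j} c^{jz}\, \frac{A_j}{B_j},$$

$$A_j = V((r - 2j)z) - uc^{(r-j)w}\, V((r - 2j)(z - w)) - u^n c^{jnw}\, V((r - 2j)(z + nw)) + u^{n+1} c^{(r + j(n-1))w}\, V((r - 2j)(z + (n - 1)w)),$$

$$B_j = 1 - uc^{jw}\, V((r - 2j)w) + u^2 c^{rw}.$$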


When $u = 1$, $b$, $c$ are rational and $z$, $w$ integral, this is an analogue for the rational numbers defined by one of the fundamental recurring series of the second order, of the summation formula for like powers of integers in arithmetical progression by means of Bernoulli polynomials. From the above we can deduce a chain of results by the transformations in §4, or by operations on $u$. It will be sufficient to state the parity transformations for functions of two variables. In what follows, $m$, $n$ are arbitrary integers. After application of the following, we may apply $w \to -w$ to reduce the arguments of the summands to the form $z + (s - 1)w$.
The parity transformations with respect to $z$ give

$$U(2mz + nw) \to (-1)^{m-1} c^{nw}\, U(2mz - nw),$$
$$U((2m - 1)z + nw) \to (-1)^{m-1} c^{nw} \Delta^{-1/2}\, V((2m - 1)z - nw),$$
$$V(2mz + nw) \to (-1)^{m} c^{nw}\, V(2mz - nw),$$
$$V((2m - 1)z + nw) \to (-1)^{m-1} c^{nw} \Delta^{1/2}\, U((2m - 1)z - nw);$$

with respect to $w$,

$$U(mz + 2nw) \to (-1)^{n} c^{2nw}\, U(mz - 2nw),$$
$$U(mz + (2n - 1)w) \to (-1)^{n-1} c^{(2n-1)w} \Delta^{-1/2}\, V(mz - (2n - 1)w),$$
$$V(mz + 2nw) \to (-1)^{n} c^{2nw}\, V(mz - 2nw),$$
$$V(mz + (2n - 1)w) \to (-1)^{n} c^{(2n-1)w} \Delta^{1/2}\, U(mz - (2n - 1)w);$$

and with respect to both $z$ and $w$,

$$U(mz + nw) \to (-1)^{(m+n-2)/2}\, U(mz + nw),$$
$$V(mz + nw) \to (-1)^{(m+n)/2}\, V(mz + nw)$$

if $m + n$ is even, while if $m + n$ is odd,

$$U(mz + nw) \to (-1)^{(m+n-1)/2} \Delta^{-1/2}\, V(mz + nw),$$
$$V(mz + nw) \to (-1)^{(m+n-1)/2} \Delta^{1/2}\, U(mz + nw).$$

In applying any one of these three transformations to the above sum formula it will be found convenient to separate the cases $r$ even, $r$ odd.

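From the formulas expressing $\sin^n x$ as a sum of sines or cosines of multiples of $x$, where $n$ is a positive integer, the isomorphism gives

$$(-\Delta)^n\, U^{2n}(z) = \sum_{s=0}^{n-1} (-1)^s \binom{2n}{s} c^{sz}\, V(2(n - s)z) + (-1)^n \binom{2n}{n} c^{nz},$$

$$(-\Delta)^{n-1}\, U^{2n-1}(z) = \sum_{s=0}^{n-1} (-1)^s \binom{2n-1}{s} c^{sz}\, U((2n - 2s - 1)z),$$

from which the corresponding formulas for $V^n(z)$ can be written down. (The odd power of the sine expands in sines of multiples of $x$, so the right member of the second formula involves $U$; for $n = 2$ it reads $-\Delta U^3(z) = U(3z) - 3c^z U(z)$.)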


7. Further transformations. As in §2, we may replace $\alpha$, $\beta$ by functions of themselves, or of other variables, and note the effect on $U(b, c, z)$, $V(b, c, z)$.
















In this way an unlimited number of relations involving functions $U(b_j, c_j, w_j)$, $V(b_j, c_j, w_j)$ ($j = 1, 2, \dots$) can be written down from any relation involving $U(b, c, z)$, $V(b, c, z)$. For we have, with the notation of §1,

$$2\alpha^z = V(z) + i\Delta^{1/2}\, U(z), \qquad 2\beta^z = V(z) - i\Delta^{1/2}\, U(z)$$

for any $b$, $c$ and the corresponding $\alpha$, $\beta$. Consider the functions $U$, $V$ for

$$(b, c, \alpha, \beta) = (b_j, c_j, \alpha_j, \beta_j).$$

Under the transformation

$$\alpha \to f(\alpha_1, \beta_1, \dots, \alpha_s, \beta_s) \equiv f, \qquad \beta \to g(\alpha_1, \beta_1, \dots, \alpha_s, \beta_s) \equiv g$$

we have

$$b \to f + g, \qquad c \to fg, \qquad i\Delta^{1/2} \to f - g,$$
$$U(z) \to (f^z - g^z)/(f - g), \qquad V(z) \to f^z + g^z.$$

We seek such $f$, $g$ as give these transforms in closed form in terms of powers of the $\alpha_j$, $\beta_j$ of the type $\alpha_j^{p_j u_j}$, $\beta_j^{q_j v_j}$, where $p_j$, $q_j$ are constants and $u_j$, $v_j$ complex variables. For such $f$, $g$ we can then express the transforms of $U(z)$, $V(z)$ in closed form in terms of functions $W(p_j u_j)$, $W(q_j v_j)$ ($W = U, V$), by means of the formulas corresponding to those for $2\alpha^z$, $2\beta^z$ above. One pair $f$, $g$ suitable for this is

$$f = \prod_{j=1}^{s} \alpha_j^{p_j u_j} \beta_j^{q_j v_j}, \qquad g = \prod_{j=1}^{s} \alpha_j^{r_j x_j} \beta_j^{t_j y_j},$$

in which the $p$, $q$, $r$, $t$ are constants and the $u$, $v$, $x$, $y$ complex variables, not necessarily independent. The invariant transformation in §2 is a special case of this.


CALIFORNIA INSTITUTE OF TECHNOLOGY. 










ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


COMBINATORIAL RELATIONS IN PROJECTIVE GEOMETRIES 
By GARRETT BIRKHOFF 


(Received February 1, 1934, Revised December 27, 1934) 


1. Introduction. The following note can be interpreted in various ways. 
The obvious interpretation is that projective geometries have an alternative 
characterization in terms of elementary combinatorial operations and a finiteness 
condition. 

More remote from current notions is the conclusion that projective geometries can be correlated in a perfectly definite scheme¹ with (1) Boolean algebras (2) “fields” of point-sets (3) “rings” of point-sets (4) systems of normal subgroups (5) systems of ideals (6) modules of a “modular” space² (7) systems of subalgebras of abstract algebras.

Of theoretical interest also is the correlation sketched in §§7-8 between the 
systems which we define and the theory of the reduction of group representa- 
tions, of compact Lie groups, and of semi-simple hypercomplex algebras. 


2. Projective geometries as complemented modular lattices. Let $V$ be any $n$-dimensional vector space with coördinates in a number-field $F$. A set $L$ of vectors $\xi, \eta, \zeta, \dots$ of $V$ is called a “vector subspace” if and only if (1) $\xi \in L$ and $\eta \in L$ imply $(\xi + \eta) \in L$ (2) $\xi \in L$ and $x \in F$ imply $(x \cdot \xi) \in L$.

Let $L$ and $M$ be any two vector subspaces of $V$. The set $L \cap M$ of vectors common to $L$ and to $M$ is a vector subspace of $V$, and so is the set $L \cup M$ of all sums $\xi + \eta$ of a vector $\xi \in L$ and a vector $\eta \in M$. It is easy to prove (and this will be done under more general hypotheses in proving Theorem 1) that

L2: $L \cap M = M \cap L$ and $L \cup M = M \cup L$.

L3: $L \cap (M \cap N) = (L \cap M) \cap N$ and $L \cup (M \cup N) = (L \cup M) \cup N$.

L4: $L \cap (M \cup L) = L \cup (M \cap L) = L$.

L5: If $L$ is contained in $N$ [in symbols, if $L \subset N$], then $L \cup (M \cap N) = (L \cup M) \cap N$.

L7: To any $L$ corresponds a “complement” $L'$ satisfying $(L \cap L') \cap M = L \cap L'$ and $(L \cup L') \cup M = L \cup L'$ irrespective of $M$.³
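Properties L5 and L7 can be verified exhaustively in a small case. The sketch below is an illustration only: it takes $F$ to be the field with two elements and $n = 3$ (an assumption chosen to keep the enumeration tiny), represents each vector as a bitmask and each subspace as a frozen set of vectors, and uses hypothetical helper names. The meet is set intersection; the join is the span of the union.

```python
from itertools import combinations

def span(vectors):
    """All GF(2)-linear combinations of the given bitmask vectors."""
    space = {0}
    for v in vectors:
        space |= {u ^ v for u in space}
    return frozenset(space)

def meet(l, m):
    return l & m

def join(l, m):
    return span(l | m)

def dim(l):
    """Dimension of a subspace with 2^d elements."""
    return len(l).bit_length() - 1

def subspaces(n):
    """Every vector subspace of GF(2)^n, generated from at most n vectors."""
    vectors = range(1, 2 ** n)
    return {span(gens) for k in range(n + 1) for gens in combinations(vectors, k)}

def modular_law_holds(spaces):
    """L5: if L is contained in N, then L join (M meet N) = (L join M) meet N."""
    return all(join(l, meet(m, nn)) == meet(join(l, m), nn)
               for l in spaces for nn in spaces if l <= nn for m in spaces)

def dimension_formula_holds(spaces):
    """dim L + dim M = dim (L meet M) + dim (L join M)."""
    return all(dim(l) + dim(m) == dim(meet(l, m)) + dim(join(l, m))
               for l in spaces for m in spaces)

def complements_exist(spaces):
    """L7: every L has some L' with L meet L' least and L join L' greatest."""
    zero, whole = frozenset({0}), max(spaces, key=len)
    return all(any(meet(l, c) == zero and join(l, c) == whole for c in spaces)
               for l in spaces)
```

In this case there are 16 subspaces (1 of dimension 0, 7 of dimension 1, 7 of dimension 2, and the whole space), and all three checks succeed.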





1 Each of the seven families listed is closely tied up with the set of all combinatorial 
systems satisfying L2-L4 and appropriate auxiliary conditions such as L5, L7, and the 
distributive law (L6). 

² I.e., subspaces of a modified vector space whose coefficients are merely the numbers of a ring of integrity. Cf. H. Grell, “Beziehungen zwischen den Idealen verschiedener Ringe,” Math. Ann. 97 (1926), 490-523.

³ A less concise but more understandable equivalent for L7 is: “The system contains a least element (= subspace) $O$ such that $O \cap L = O$ for any $L$, and a greatest element $\Omega$ such that $\Omega \cup M = \Omega$ for any $M$. And to any $L$ corresponds at least one $L'$ such that $L \cap L' = O$ and $L \cup L' = \Omega$.”




The vector subspaces of $V$ may also be regarded as elements of a “projective geometry” $P(V)$, and as such have a number of familiar geometrical properties. For instance, if we call the one-dimensional subspaces of $V$ “points,” the two-dimensional subspaces “lines,” etc., then

P1: Two distinct points are contained in one and only one line.

P2: If $A$, $B$, $C$ are points not all on the same line, and $D$ and $E$ ($D \neq E$) are points such that $B$, $C$, $D$ are on a line and $C$, $A$, $E$ are on a line, then there is a point $F$ such that $A$, $B$, $F$ are on a line and also $D$, $E$, $F$ are on a line.

P3: Every line contains at least three points.⁴

P4: The points on lines through any $k$-dimensional⁵ element and a fixed point not on the element are a $(k + 1)$-dimensional element, and every $(k + 1)$-dimensional element can be defined in this way.


By a “projective geometry” is meant⁶ any abstract system which shares with $P(V)$ properties P1-P4, and in which there is a finite upper bound to the dimensions of the elements.

Moreover any projective geometry satisfies L2-L7, after intersection and conjunction (“meet” and “join”) have been defined in the obvious manner. In fact, L5 and L7 are the only properties whose truth is not evident.

To prove L5, observe that (1) since $L \cup M$ and $N$ each contain both $L$ and $M \cap N$, $L \cup (M \cap N) \subset (L \cup M) \cap N$, and (2) counting dimensions⁷

$$\begin{aligned} \dim L \cup (M \cap N) &= \dim L + \dim M \cap N - \dim L \cap M \cap N \\ &= \dim L + \dim M + \dim N - \dim M \cup N - \dim L \cap M \\ &= \dim L \cup M + \dim N - \dim L \cup M \cup N \\ &= \dim (L \cup M) \cap N. \end{aligned}$$

To prove L7, let $L$ be any $k$-dimensional element of a projective geometry. By P4 we can find points $B_1, \dots, B_{n-k}$ such that $B_{i+1} \not\subset L \cup B_1 \cup \cdots \cup B_i$. Set $L' = B_1 \cup \cdots \cup B_{n-k}$. By P4, $L \cup L'$ is the whole space; hence

$$\dim L \cap L' = \dim L + \dim L' - \dim L \cup L' = k + (n - k) - n = 0$$

and $L \cap L'$ is empty. L7 is now obvious.

Hence if we define any system with two binary operations satisfying L2-L5 and L7 as a “complemented modular lattice,” then

THEOREM 1: Any projective geometry P determines a complemented modular 





⁴ Unless $F$ contains only the numbers 0 and 1.

⁵ To preserve the correspondence with dimensionality in the primitive vector space, and for other reasons, we shall term what is usually called a “$k$-space” a “$(k + 1)$-dimensional element.”

⁶ O. Veblen and J. W. Young, “Projective geometry,” Boston, 1910.

⁷ We use Theorems $S_n2$ and $S_n3$ of Veblen and Young, Vol. I, pp. 32-3, which show that $\dim L + \dim M = \dim L \cap M + \dim L \cup M$.







lattice C(P) by a transformation under which (1) distinct elements (points, lines, 
planes, etc.) of P go over into distinct elements of C(P) (2) intersections go over into 
“meets,” and conjunctions into “joins.” 


3. Covering and dimensions in modular lattices. This section and the next three will be devoted to proving a converse of Theorem 1.

First, let $C$ be any modular lattice, i.e., any system with two binary operations satisfying L2-L5. An element $a$ of $C$ is said to “properly contain” an element $b$ of $C$ [in symbols, $a > b$] if and only if $a \supset b$ (i.e., $a \cup b = a$) and $a \neq b$. The element $a$ is said to “cover” $b$ if and only if $a > b$, and $a > c > b$ has no solution, i.e., if and only if $a$ is a minimal element properly containing $b$.

If there is a finite number $n$ such that every sequence $a_1 > a_2 > a_3 > \cdots$ of decreasing elements of $C$ contains at most $n$ terms, then $C$ is said to be of “finite dimensions,” and it has been proved by Dedekind⁸ that under these circumstances

(3.1) $C$ has a “least” element $o_C$ contained in every other element, and a “greatest” element $g_C$ containing every other element. One can assign a “dimension integer” $d(a)$ to each $a \in C$ such that (i) $d(o_C) = 0$, (ii) $a$ covers $b$ if and only if $a > b$ and $d(a) = d(b) + 1$, (iii) $d(a) + d(b) = d(a \cap b) + d(a \cup b)$.

By the “dimensions” of $C$ is naturally meant $d(g_C)$. To express the analogy with projective geometries, let us further call the one-dimensional elements of $C$ “points,” and the two-dimensional elements, “lines.”

Any modular lattice $C$ satisfies P1, since if $a$ and $b$ are any two points of $C$, then

$$\dim a \cup b = \dim a + \dim b - \dim a \cap b = 1 + 1 - 0 = 2.$$

$C$ also satisfies P2, since under the hypotheses of the latter (replacing the capitals by small letters),

$$\begin{aligned} \dim (a \cup b) \cap (d \cup e) &= \dim a \cup b + \dim d \cup e - \dim a \cup b \cup d \cup e \\ &= 2 + 2 - \dim a \cup b \cup c = 4 - 3 = 1 \end{aligned}$$

and $f = (a \cup b) \cap (d \cup e)$ satisfies the conclusions of P2. Finally, $C$ satisfies the first half of P4, by a repetition of the argument proving P1.


4. Reducibility in complemented modular lattices. Suppose C is a com- 
plemented modular lattice. We can easily show 

Lemma 1: Any element a of C such that d(a) > 0 is the join of d(a) suitably 
chosen points of C. 





⁸ Ges. Werke, Braunschweig, 1932, Vol. II, p. 264. Cf. also the author's “On the combination of subalgebras,” Proc. Camb. Phil. Soc. 29 (1933), 441-64. (3.1) shows incidentally that the usual theory of linear dependence can be obtained purely abstractly.










Suppose a is not itself a point. Then there exists a point c < a, and if c′
denotes the complement of c, then

$$c \cup (c' \cap a) = (c \cup c') \cap a = a \qquad \text{[by L5 and L7]}.$$

Moreover $c \cap (c' \cap a) = (c \cap c') \cap a = o_C$; hence $d(c) + d(c' \cap a) = d(a)$. Lemma
1 now follows by finite induction on dimensions, and application of the same
principle to $c' \cap a$.
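The dimension count behind this induction can be written out as follows. This is a sketch that assumes only property (iii) of (3.1) and the two identities of the preceding proof.

```latex
\begin{align*}
d(c) + d(c' \cap a)
  &= d\bigl(c \cap (c' \cap a)\bigr) + d\bigl(c \cup (c' \cap a)\bigr)
     && \text{by (3.1)(iii)} \\
  &= d(o_C) + d(a) \\
  &= d(a).
\end{align*}
```

Since c is a point, d(c) = 1, so $d(c' \cap a) = d(a) - 1$ and the induction hypothesis applies to $c' \cap a$.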


5. Conjoint points. As matters now stand, P3 is the only property of 
projective geometries not established in complemented modular lattices. 
The work of §§5-6 will be devoted to settling the status of P3. 

To do this, we first define a relation of “conjointness” among the points of C,
by stating (1) any point of C is conjoint with itself, (2) a point $p_i$ of C is conjoint
with a different point $p_j$ of C if and only if a third point $p_k$ exists contained in
$p_i \cup p_j$.

Lemma 2: The relation of conjointness has the properties of equivalence—it is 
reflexive, symmetric, and transitive. 

It is reflexive by (1); further, since $p_i \cup p_j = p_j \cup p_i$, it is symmetric. Finally,
if a is conjoint with c, and c with b, then there exist new points e and d on $a \cup c$
and $c \cup b$; consequently either c is a third point on $a \cup b$ or else the hypotheses of
P2 are satisfied, and a new point f exists on $a \cup b$.

Lemma 3: If the join $p_1 \cup \cdots \cup p_r$ of any set of points of C contains a point q,
then q is conjoint with at least one of the points $p_1, \ldots, p_r$.

For we can clearly so choose k that $q \not\subset p_1 \cup \cdots \cup p_k$, yet $q \subset p_1 \cup \cdots \cup p_{k+1}$.
Then $p_{k+1} \cup q$ must intersect $p_1 \cup \cdots \cup p_k$ on a third point (considering the
dimensional arithmetic), whence by definition $p_{k+1}$ and q are conjoint.
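The “dimensional arithmetic” can be spelled out as below. This is a sketch under the following assumptions: s abbreviates $p_1 \cup \cdots \cup p_k$ with m = d(s), the index k is chosen as in the proof, and q differs from $p_{k+1}$ (otherwise q is conjoint with $p_{k+1}$ by definition). The symbols s and m are introduced here for illustration only.

```latex
% p_{k+1} is not contained in s: otherwise s \cup p_{k+1} = s would contain q.
% Hence d(s \cup p_{k+1}) = m + 1, and q \subset s \cup p_{k+1} gives
% (p_{k+1} \cup q) \cup s = s \cup p_{k+1}.
\begin{align*}
d\bigl((p_{k+1} \cup q) \cap s\bigr)
  &= d(p_{k+1} \cup q) + d(s) - d\bigl((p_{k+1} \cup q) \cup s\bigr) \\
  &= 2 + m - d(s \cup p_{k+1}) \\
  &= 2 + m - (m + 1) = 1 .
\end{align*}
```

The intersection is therefore a single point. It lies in s, so it is different from both q and $p_{k+1}$, and is a third point on the line $p_{k+1} \cup q$.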


6. Direct decomposition. By Lemma 2, the points of C can be divided
into a number of non-overlapping sets $\alpha_1, \ldots, \alpha_r$ such that two points are conjoint
if and only if they are in the same set.¹⁰ Let $a_k$ denote the join of the
points of $\alpha_k$ [k = 1, ..., r], and let c be any element of C.

By Lemma 1, c is the join of those points which it contains, and so
$c = (c \cap a_1) \cup \cdots \cup (c \cap a_r)$. This assigns to each element of C a set of r elements
$c_1 = c \cap a_1 \subset a_1, \ldots, c_r = c \cap a_r \subset a_r$. The assignment is a (1,1) correspondence
if (and only if) for every choice of $c_1 \subset a_1, \ldots, c_r \subset a_r$ and k = 1, ..., r it
is true, writing c for $c_1 \cup \cdots \cup c_r$ and $b_k$ for $c_1 \cup \cdots \cup c_{k-1} \cup c_{k+1} \cup \cdots \cup c_r$, that
$c \cap a_k = c_k$. But by L5 $c \cap a_k = (c_k \cup b_k) \cap a_k = c_k \cup (b_k \cap a_k)$, and since no
point can lie in both $b_k$ and $a_k$ (by Lemma 3), $b_k \cap a_k = o_C$; hence $c \cap a_k = c_k$.

Moreover the assignment preserves inclusion, and therefore is isomorphic





9 In fact, I suspect that conversely P1 + P2 + P4 and finite dimensionality imply L2-L7,
but I have not checked this.

10 Q. Ore has since shown that a similar partition into disjoint classes exists between the
“prime factors” of any modular lattice; these are “points” in the case we consider.


COMBINATORIAL RELATIONS IN PROJECTIVE GEOMETRIES 747 


with respect to meet and join—and consequently carries complements into 
complements. Therefore 

THEOREM 2: Any complemented modular lattice C of finite dimensions is isomorphic
with the direct product of the sublattices $A_1, \ldots, A_r$ of elements contained
in the joins $a_1, \ldots, a_r$ of its complete sets of conjoint points.

Those $a_k$ containing more than one point define sublattices in which by Lemma
2 P3 is satisfied; the other $a_k$ define Boolean algebras of two elements. Consequently,
uniting the different Boolean algebras so defined, we get

THEOREM 3: Any complemented modular lattice of finite dimensions is isomorphic 
with the direct product of a finite Boolean algebra and a finite number of projective 
geometries. 

Conversely, since L2-L5 and L7 are preserved under combination into direct
products, any direct product of a finite Boolean algebra and a finite number of 
projective geometries is a complemented modular lattice of finite dimensions. 


7. Application to group representations.¹¹ Let Γ be any group of orthogonal
(or unitary) transformations of a space S. Any linear manifold A invariant
under Γ is said to “half-reduce” Γ; the group of linear transformations
induced by Γ on A is denoted by $\Gamma_A$. Clearly if A is invariant under Γ then so
is the orthogonal complement A′ of A, while Γ is determined by $\Gamma_A$ and $\Gamma_{A'}$.
For this reason Γ is said to be “fully reduced,” and we write $\Gamma = \Gamma_A + \Gamma_{A'}$.

The linear manifolds of S invariant under Γ are evidently a sublattice of the
projective geometry of all linear manifolds of S, and hence constitute a modular
lattice $M_\Gamma$ of finite dimensions. But we have seen that each element of $M_\Gamma$
has a complement in $M_\Gamma$; that is, $M_\Gamma$ is complemented.

Therefore by Lemma 1 S has a representation as the join of dim $M_\Gamma$ “irreducible”
invariant manifolds (i.e., manifolds which contain no proper invariant
submanifolds). Moreover by Theorem 2 these are unique to within conjointness.

But if A, B, and C are on a “line” of $M_\Gamma$, then the factor-space S/A formed by
projecting A into the origin defines a (1,1) linear transformation of B onto C
carrying $\Gamma_B$ into $\Gamma_C$. That is, by definition, $\Gamma_B$ and $\Gamma_C$ are “equivalent” if¹² B
and C are conjoint in $M_\Gamma$. Taken with the above, this shows

(7.1) Any group of orthogonal (or unitary) transformations can be reduced 
in one and (apart from equivalence) only one way into the sum of irreducible 
components. 


8. Other applications. Similarly, since the adjoint of any compact Lie 
group can be orthogonalized, since invariant linear manifolds of the space of the 
adjoint correspond to normal subgroups of the nucleus of the group, and since 





11 §§7-8 were added in revision.

12 The converse is also true: associate with equivalent representations on A and on B the
manifold $C_{x,y}$ of points $x\alpha + y\beta$ [x and y fixed numbers; $\alpha \in A$ and $\beta \in B$ corresponding
under the given “equivalence”]; then $C_{x,y}$ will be invariant.







“conjoint” irreducible manifolds correspond (as Remak has shown by a simple
analysis)¹³ to “centrally isomorphic” least normal subgroups, which can only be
one-parameter normal subgroups, we obtain 

(8.1) The nucleus of any compact Lie group is the direct product of simple 
group nuclei and one-parameter group nuclei. The representation as a direct 
product is unique except as to one-parameter group nuclei, and the number of 
these is fixed. 

And finally, once it has been proved that in any “semi-simple”¹⁵ hypercomplex
algebra A, to any invariant subalgebra S there corresponds a “complementary”
invariant subalgebra S′ of elements $(a - 1_S \cdot a - a \cdot 1_S + 1_S \cdot a \cdot 1_S)$ of A [$1_S$ the
principal unit of S], satisfying $S \cup S' = A$, $S \cap S' = 0$, it follows directly that

(8.2) Any semi-simple hypercomplex algebra is the direct sum of simple 
hypercomplex algebras. 


9. Open questions. It was seen at the end of §6 that the converse of
Theorem 3 held; therefore in one sense conditions L2-L5 and L7 are complete.
But there are two directions in which work remains to be done.

For instance,¹⁶ can all the laws of combination of projective manifolds be
obtained from L2-L5 and L7 by the ordinary rules of inference? That this
cannot be assumed is illustrated by the fact that every finite field obeys the law
ab = ba, although this is no longer true without the restriction of finiteness.

Again, it would be desirable to characterize combinatorially those projective
geometries which are associated with actual vector spaces, or equivalently,¹⁷
which satisfy the Theorems of Desargues and Pascal.

Other questions are: (1) Can the (non-desarguesian) finite plane projective
geometries be constructively enumerated? (2) What are the dimensions of the
“free” modular lattice generated by n symbols? of the “free” complemented
modular lattice so generated? If they are finite, then the answer to the question
of paragraph two is yes. (3) Characterize in geometrical terms those projective
geometries in which


L7*: $(L \cap M)' = L' \cup M'$, $(L \cup M)' = L' \cap M'$, and $(L')' = L$.


SOCIETY OF FELLOWS, HARVARD UNIVERSITY.





13 “Über minimale invariante Untergruppen,” Jour. f. Math. 162 (1930), 1-16.

14 E. Cartan, “Groupes simples clos,” Jour. de Math. pures et appliquées 8 (1929), p. 11.

15 I.e., algebra without nilpotent invariant subalgebra.

16 This question was kindly suggested to me by K. Gödel.

17 O. Veblen and J. H. Maclagan Wedderburn, “Non-Desarguesian and non-Pascalian
geometries,” Trans. Am. Math. Soc. 8 (1907), 379-88.








ANNALS OF MATHEMATICS
Vol. 36, No. 3, July, 1935


ON A CONVERSE OF KNESER’S TRANSVERSALITY THEOREM¹
By LINCOLN LAPAZ AND TIBOR RADÓ
(Received August 17, 1934) 
§0. Introduction 


0.1. Let there be given, in the three-dimensional Euclidean space $(x_1, x_2, t)$,² a
transversality T, that is to say a one-to-one correspondence between line-elements
and surface-elements with the same base-point, and a four-parameter
family Φ of curves. Consider a surface Σ, and denote by $\varphi^{(2)}$ the family of those
curves of Φ which are transversal to Σ, according to the given transversality T.
There may or may not exist a one-parameter family $S^{(1)}$ of surfaces, such that
the surfaces of $S^{(1)}$ are transversal to the curves of $\varphi^{(2)}$.

If such a family $S^{(1)}$ exists for every choice of the surface Σ, then there exists a
variation problem

$$\delta \int G(x_1, x_2, t, \dot{x}_1, \dot{x}_2)\, dt = 0,$$

such that the curves of the family Φ coincide with the extremals and the transversality
T coincides with the Kneser transversality of the problem.³

The purpose of this paper is to give a new proof of this theorem. Previous
proofs,⁴ of varying generality, are due to Kasner, Lipke, Schouten, Blaschke and
Douglas. We are going to discuss briefly the relations of our paper to previous
literature.

0.2. The method of Douglas is based on an important theorem of Vessiot.⁵
Let τ(ρ) be the general element of a one-parameter group of contact transformations,
depending upon the parameter ρ. Consider a surface-element σ





1 Presented to the American Mathematical Society, Chicago, April 1934. 

2 The notation $(x_1, x_2, t)$ was chosen to indicate that our developments remain valid in
any Euclidean space $(x_1, \ldots, x_n, t)$.

3 The exact statements of the notions and theorems occurring in this introduction will be
given in §1.

4 E. Kasner, Differential-geometric aspects of dynamics, Princeton Colloquium Lectures,
1909, pp. 35-44; J. Lipke, Natural families of curves in a general curved space of n dimensions,
Transactions of the Amer. Math. Soc., vol. 13, 1912, pp. 77-95; J. A. Schouten,
Über die Umkehrung eines Satzes von Lipschitz, Nieuw Archief v. Wiskunde, vol. 15, 1928,
pp. 97-102; W. Blaschke, Eine Umkehrung von A. Knesers Transversalensatz, ibid., pp.
202-204; J. Douglas, Extremals and transversality of the general calculus of variations
problem of the first order in space, Transactions of the Amer. Math. Soc., vol. 34, 1927, pp.
401-420.

5 E. Vessiot, Sur l’interprétation mécanique des transformations de contact infinitésimales,
Bull. Soc. Math. France, vol. 34, 1906, pp. 230-269. J. Douglas, loc. cit.⁴, gives a new proof
for the theorem of Vessiot. His proof should be supplemented by an easy discussion of
certain jacobians. Such a discussion would show 1) that the theorem of Vessiot ceases to
be valid in certain trivial exceptional cases, and 2) that these exceptional cases are ruled
out by our assumption that the given family Φ is a four-parameter family.







750 LINCOLN LAPAZ AND TIBOR RADO 


with base-point P. Denote by σ(ρ) the transform of σ under τ(ρ), and by P(ρ)
the base-point of σ(ρ). For fixed σ and variable ρ, the point P(ρ) describes a
curve Γ which is called a path-curve of the group τ(ρ). The line-element λ,
tangent to Γ at P, and the surface-element σ are said to be conjugate to each
other with respect to the group τ(ρ). According to the theorem of Vessiot, there
exists then a variation problem


$$\delta \int G(x_1, x_2, t, \dot{x}_1, \dot{x}_2)\, dt = 0,$$


such that the path-curves of τ(ρ) coincide with the extremals of the problem,
while conjugate elements are transversal to each other with respect to the 
problem (in the Kneser sense). 

0.3. Douglas starts with the remark that, on account of the theorem of 
Vessiot, it is sufficient to prove the following statement. 

If Φ and T satisfy the condition stated in 0.1, then there exists a one-parameter
group of contact transformations such that the path-curves of the group coincide
with the curves of Φ, while conjugate elements, with respect to the group, are transversal
according to T.

0.4. We start with this same remark, that is to say we consider the statement 
in 0.3 as our objective. Otherwise our method is totally different from that of 
Douglas. While the proof of Douglas is based on computations, our method is 
purely geometrical. 

The statement in 0.3 is concerned with a purely geometrical situation. The 
present paper originated with our desire to understand this geometrical situation 
in a geometrical manner. 

0.5. An important device in our method was suggested by the paper of 
Blaschke, who considers the generalized arc-element 


$$G(x_1, x_2, t, \dot{x}_1, \dot{x}_2)\, dt$$


as the unknown quantity. Blaschke actually succeeds in determining this line-element
by a generalized congruent displacement of infinitesimal arcs. In our
own method, we apply the same displacement to finite arcs. We are led this
way to a congruence relation (see §6), which enables us to exhibit, without any
computations, the group τ(ρ) (see §7).

0.6. In view of the fact that the paper of Blaschke is barely three pages long, 
we feel that we should try to explain the (comparatively speaking) extreme 
length of our paper. 

Expert criticism, which we had the privilege to utilize while preparing the 
present final version of our paper, revealed the necessity of great caution in 
presenting the very simple geometrical idea of our proof. As a consequence, we
felt that we should not even try to imitate the extreme conciseness of the paper 
of Blaschke. However, the brevity of the proof of Blaschke is not a matter of 
clever presentation alone. While working out the details of our paper in the 
cautious manner we felt we should adopt, we hit upon a difficulty,⁶ apparently





6 See §8. 










ON A CONVERSE OF KNESER’S TRANSVERSALITY THEOREM 751 


overlooked by Blaschke, which forced us to add a number of pages to our 
manuscript. For the sake of accuracy, it should also be observed that our result 
is more complete. Let us recall that a transversality T, in order to be the
Kneser transversality of a variation problem, has to satisfy a simple and beautiful
(necessary and sufficient) condition, discovered by Kasner.⁷ A transversality
T satisfying this condition is said to be of special type. The wording
of the theorem in 0.1, as it appears in the paper of Douglas, contains the statement
that the given transversality T is necessarily of special type.⁷ᵃ The
brevity of the proof of Blaschke is partly due to the fact that he presupposes
the special character of T. This restriction is not used in our present paper.

0.7. In order to avoid certain rather obvious but quite annoying details, in 
particular in §2, we suppose that the family Φ and the transversality T are given
in the large (see 1.1 and 1.3). Except for this, the assumptions we make are
easily seen to be satisfied for non-singular analytic variation problems. The
special case when Φ consists of straight lines and T reduces to orthogonality,
shows immediately that our assumptions are not vacuous. This is quite an 
important point, in the light of the remarks in §8. 


§1. Preliminaries 


1.1. The family Φ. We suppose that we are given a family Φ of curves Γ, in
the $(x_1, x_2, t)$-space, with the following properties:

I. Every curve Γ of the family is represented by a system of equations


$$x_1 = x_1(t), \qquad x_2 = x_2(t),$$


where $x_1(t)$, $x_2(t)$ are analytic functions of t in a certain interval t′ < t < t″.

II. Through every point there passes in every direction, not parallel to the
$(x_1, x_2)$-plane, a unique curve Γ of the family.

1.2. We shall denote line-elements by λ, surface-elements by σ, everything
being considered in the $(x_1, x_2, t)$-space. Line-elements and surface-elements
will be represented in the usual manner by five coördinates $(x_1, x_2, t, \dot{x}_1, \dot{x}_2)$ and
$(x_1, x_2, t, p_1, p_2)$ respectively. The use of this representation implies the assumption,
which we shall use throughout this paper, that we only consider line-elements
not parallel to the $(x_1, x_2)$-plane and surface-elements not parallel to the t-axis.





7 E. Kasner, Transversality in space of three dimensions, Transactions of the Amer. Math.
Soc., vol. 30, 1928, pp. 447-452.

7a The statement of the theorem as given by J. Douglas in his paper, loc. cit.,⁴ p. 404,
is as follows: “If in connection with a family ℱ of ∞⁴ curves in space definable analytically
in the form (5) there exists a transversality T, necessarily of special character, such that
the ∞² curves of ℱ which meet transversely an arbitrary base surface (union) Σ admit ∞¹
transverse surfaces, then ℱ must be the system of extremals of a calculus of variations problem
whose transversality is T.” According to an oral communication of Professor Douglas
made in September, 1934, his explicit reference to the necessarily special character
of T was not meant by him to imply that he presupposed the transversality T to be of
special character.










1.3. The transversality T. We suppose that there is given a correspondence

$$p_i = \xi_i(x_1, x_2, t, \dot{x}_1, \dot{x}_2), \qquad i = 1, 2, \tag{1}$$

with the following properties:

I. $\xi_1$ and $\xi_2$ are analytic functions for all finite real values of their arguments.

II. For every point $(x_1, x_2, t)$, the equations (1) define a one-to-one correspondence
between the couples $(\dot{x}_1, \dot{x}_2)$ and $(p_1, p_2)$.

Under these assumptions, the equations (1) define a one-to-one correspondence
between the line-elements and the surface-elements with the same base-point
$(x_1, x_2, t)$, for every choice of this base-point. We add the further assumption
that

III. No line-element is comprised in the corresponding surface-element. 

A correspondence with these properties will be called a transversality. We 
suppose that we are also given a transversality, and we shall denote it by T. 

1.4. Surfaces Σ. Whenever in the sequel we shall use Σ to denote a surface,
this notation will imply the following properties. The surface is given by an
equation


$$t = F(x_1, x_2), \qquad (x_1, x_2) \subset D,$$


where D denotes a simply connected bounded domain in the $(x_1, x_2)$-plane.
The function $F(x_1, x_2)$ is single-valued and analytic in D and on the boundary
of D.

If we consider several surfaces Σ at a time, it will not be presupposed that
they are given above the same domain D.

1.5. Cables. Suppose there is given 1) a simply connected domain R in
$(x_1, x_2, t)$-space; 2) a (two-parameter) family $\varphi^{(2)}$ of curves γ, which are open
sub-arcs of curves of the family Φ, such that R is simply and completely covered
by the curves of $\varphi^{(2)}$; 3) a (one-parameter) family $S^{(1)}$ of surfaces Σ such that R is
simply and completely covered by the surfaces of $S^{(1)}$; 4) at each point Q of the
domain R the curve γ of $\varphi^{(2)}$ and the surface Σ of $S^{(1)}$ are transversal to each
other, according to the given transversality T; 5) every curve γ of $\varphi^{(2)}$ intersects
every surface Σ of $S^{(1)}$ in exactly one point within R.

Under these conditions, we shall call the figure consisting of R, $\varphi^{(2)}$, $S^{(1)}$ a
cable. The curves γ of $\varphi^{(2)}$ are the fibers and the surfaces Σ of $S^{(1)}$ are the cross-sections
of the cable.

1.6. We shall say that the given family Φ and the given transversality T
satisfy the cable-condition, if for every surface Σ there exists a cable K such that
Σ is a cross-section of K. We suppose in the sequel that this condition is satisfied.

1.7. We shall use the symbol 


$$(\Sigma, K, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_2) \tag{2}$$


to refer to the following figure: K denotes a cable in which the surface Σ is a
cross-section. $\Gamma_1$ and $\Gamma_2$ are curves of the family Φ which are transversal to Σ
at the points $P_1$ and $P_2$ (located on Σ). The fiber of K through $P_i$ is then an
open sub-arc $\gamma_i$ of $\Gamma_i$ which contains $P_i$ (i = 1, 2). $\bar{P}_1$ is a point on $\gamma_1$. There
passes through $\bar{P}_1$ a unique cross-section of K which intersects $\gamma_2$ in a unique
point $\bar{P}_2$.

1.8. Given Σ, K, $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$ as described in 1.7, it is clear that every point
$\bar{P}_1$ on $\Gamma_1$, close enough to $P_1$, gives rise to a figure (2).

1.9. We shall see later on that the point $\bar{P}_2$ in a figure (2) is univocally determined
by $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$, $\bar{P}_1$. For the moment, we only need the simple fact that
$\bar{P}_2$ is univocally determined by $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$, $\bar{P}_1$ and Σ. We shall verify a somewhat
stronger assertion. Let us connect $P_1$ and $P_2$ by an analytic curve C
on Σ, and let us denote by s the first-order strip consisting of C and of the tangent
planes of Σ along C. Then $\bar{P}_2$ is univocally determined by $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$, $\bar{P}_1$ and s.

1.10. To see this, let us consider two figures 


$$(\Sigma_1, K_1, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_{21}) \tag{3}$$


and

$$(\Sigma_2, K_2, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_{22}), \tag{4}$$


which have in common the elements $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$, $\bar{P}_1$, and such that $\Sigma_1$ and $\Sigma_2$
are in contact along an analytic curve C which passes through $P_1$ and $P_2$. We
have to verify that $\bar{P}_{21} = \bar{P}_{22}$.

Let E be a point on C, and consider the curve Γ of the family Φ which is
transversal to the common tangent plane of $\Sigma_1$ and $\Sigma_2$ at E. The fibers of $K_1$
and $K_2$ through E are then open sub-arcs of Γ which contain E. Let us denote
by γ the common part of these fibers, and by $\varphi^{(1)}$ the (one-parameter) family of
the arcs γ corresponding in this manner to the points of C. The curves γ of $\varphi^{(1)}$
constitute then a surface which we denote by A.

At each point Q of A, we consider the tangent plane of A and the plane which is
transversal to the curve γ of $\varphi^{(1)}$ passing through Q. These planes intersect in a
line-element λ with base-point Q.⁸ The line-elements λ obtained in this manner
constitute a field of directions on A. A curve on A which at each of its points has
the direction prescribed by this field, will be called a transversal trajectory of the
family $\varphi^{(1)}$. On account of well-known theorems on systems of ordinary differential
equations, a transversal trajectory is univocally determined by any one of
its points, and, of course, by the family $\varphi^{(1)}$.

By the definition of the figures (3) and (4), the points $\bar{P}_1$ and $\bar{P}_{21}$ are located
on the same transversal trajectory $\delta_1$ of $\varphi^{(1)}$. Similarly, the points $\bar{P}_1$ and $\bar{P}_{22}$
are located on the same transversal trajectory $\delta_2$ of $\varphi^{(1)}$. But $\delta_1$ and $\delta_2$, being
transversal trajectories of $\varphi^{(1)}$ through the same point $\bar{P}_1$, are identical. Since
$\bar{P}_{21}$ is the intersection of $\delta_1$ with $\Gamma_2$, and $\bar{P}_{22}$ is the intersection of $\delta_2$ with $\Gamma_2$, and
since $\delta_1 = \delta_2$, there follows that $\bar{P}_{21} = \bar{P}_{22}$.

1.11. The condition of analyticity. Suppose that in a figure (2) the surface Σ
depends analytically upon a finite number of parameters $a_1, a_2, \ldots, a_k$, that
$P_1$, $P_2$ vary on Σ, and that $\bar{P}_1$ varies on $\Gamma_1$ in the vicinity of $P_1$. According to





8 Cf. condition III in 1.3.




1.9, the point $\bar{P}_2$ and consequently the line-element $\bar{\lambda}_2$ (which is tangent to $\Gamma_2$ at
$\bar{P}_2$) are univocally determined by Σ, $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$, $\bar{P}_1$. On the other hand, $\Gamma_i$
is determined by Σ and $P_i$ as the unique curve of the family Φ which is transversal
to Σ at $P_i$ (i = 1, 2). The coördinates of $\bar{P}_2$ and of $\bar{\lambda}_2$ are therefore functions
of the parameters $a_1, a_2, \ldots, a_k$, of the coördinates $x_1^{(1)}, x_2^{(1)}, x_1^{(2)}, x_2^{(2)}$ of the
$(x_1, x_2)$-projections of the points $P_1$, $P_2$ and of the difference Δt between the
t-coördinates of $\bar{P}_1$ and $P_1$. Our condition of analyticity requires that the coördinates
of $\bar{P}_2$ and $\bar{\lambda}_2$ be single-valued analytic functions in the vicinity of every
set of values $(x_1^{(1)}, x_2^{(1)}, x_1^{(2)}, x_2^{(2)}, a_1, \ldots, a_k, 0)$, such that the surface Σ is defined
for $(x_1^{(1)}, x_2^{(1)}, a_1, \ldots, a_k)$ and for $(x_1^{(2)}, x_2^{(2)}, a_1, \ldots, a_k)$.

1.12. THE THEOREM. Given a family Φ (see 1.1) and a transversality T (see
1.3), satisfying the cable-condition (see 1.6) and the condition of analyticity (see
1.11), there exists a one-parameter group τ(ρ) of contact transformations

$$\tau(\rho):\quad
\begin{cases}
\bar{x}_1 = \bar{x}_1(x_1, x_2, t, \dot{x}_1, \dot{x}_2, \rho), \\
\bar{x}_2 = \bar{x}_2(x_1, x_2, t, \dot{x}_1, \dot{x}_2, \rho), \\
\bar{t} = \bar{t}(x_1, x_2, t, \dot{x}_1, \dot{x}_2, \rho), \\
\bar{p}_1 = \bar{p}_1(x_1, x_2, t, \dot{x}_1, \dot{x}_2, \rho), \\
\bar{p}_2 = \bar{p}_2(x_1, x_2, t, \dot{x}_1, \dot{x}_2, \rho),
\end{cases}
\tag{5}$$

with the following properties:

I. For every set of finite real values $x_1^{(0)}, x_2^{(0)}, t^{(0)}, \dot{x}_1^{(0)}, \dot{x}_2^{(0)}$, the functions appearing
in (5) are analytic in the vicinity of $(x_1^{(0)}, x_2^{(0)}, t^{(0)}, \dot{x}_1^{(0)}, \dot{x}_2^{(0)}, 0)$.

II. If we denote by σ(ρ) the transform of a surface-element σ under τ(ρ), then for
fixed σ and variable ρ the base-point of σ(ρ) varies on the curve Γ of the family Φ
which is transversal to σ.

This group τ(ρ) will be defined in §7, on the basis of a congruence relation
introduced and discussed in §6.


§2. Remarks on the contact of surfaces

2.1. Suppose we are given, in the (x, y, z)-space,⁹ two analytic surfaces

$$\Sigma_i : z = f^{(i)}(x, y), \qquad i = 1, 2,$$

in the vicinity of x = 0, y = 0, such that

$$f^{(1)}(0, 0) = f^{(2)}(0, 0), \qquad f_x^{(1)}(0, 0) = f_x^{(2)}(0, 0), \qquad f_y^{(1)}(0, 0) = f_y^{(2)}(0, 0).$$

Through (0, 0) we choose two analytic curves $c_i$ in the (x, y)-plane with equations


$$c_i : y = g_i(x), \qquad i = 1, 2. \tag{6}$$





9 This notation is used here to indicate that we are using an auxiliary system of coördinates,
which have to be chosen so as to fit the situation in 3.2.










We denote by $C_i$ the curve on the surface $\Sigma_i$ whose (x, y)-projection is $c_i$. Let
us ask for a surface


$$\Sigma : z = f(x, y)$$


in the vicinity of (0, 0), which is in contact with the surface $\Sigma_i$ along the curve
$C_i$ (i = 1, 2). Supposing such a surface exists and has continuous first and
second derivatives, we obtain by simple computations that the following equation
must be satisfied at (0, 0):


$$\bigl(f^{(2)}_{xx} - f^{(1)}_{xx}\bigr) + \bigl(f^{(2)}_{xy} - f^{(1)}_{xy}\bigr)(g_2' + g_1') + \bigl(f^{(2)}_{yy} - f^{(1)}_{yy}\bigr)\, g_2'\, g_1' = 0. \tag{7}$$

If the surfaces $\Sigma_1$, $\Sigma_2$ have a contact of the second order above (0, 0), then (7)
does not imply any restrictions, since then


$$f^{(2)}_{xx} - f^{(1)}_{xx} = 0, \qquad f^{(2)}_{xy} - f^{(1)}_{xy} = 0, \qquad f^{(2)}_{yy} - f^{(1)}_{yy} = 0$$


at (0, 0). In the general case however there follows from (7) that the curves (6)
cannot be chosen arbitrarily if the surface Σ is to have continuous second derivatives.
This situation, as well as certain minor difficulties of a topological
character, forced the authors to go into the details developed in §2 to §5, instead
of reproducing a three-line reasoning of Blaschke (cf. the remarks in §8).
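The “simple computations” leading to (7) can be sketched as follows. The abbreviations u, v, w and $D_{xx}$, $D_{xy}$, $D_{yy}$ are introduced here for illustration and are not the authors' notation; all second derivatives are evaluated at (0, 0).

```latex
% Contact of \Sigma with \Sigma_i along C_i means, for i = 1, 2 and all small x,
%   f_x(x, g_i(x)) = f^{(i)}_x(x, g_i(x)),   f_y(x, g_i(x)) = f^{(i)}_y(x, g_i(x)).
% Differentiating both identities in x at x = 0 gives four linear relations.
\begin{gather*}
u = f_{xx} - f^{(1)}_{xx}, \qquad v = f_{xy} - f^{(1)}_{xy}, \qquad w = f_{yy} - f^{(1)}_{yy}, \\
D_{xx} = f^{(2)}_{xx} - f^{(1)}_{xx}, \qquad
D_{xy} = f^{(2)}_{xy} - f^{(1)}_{xy}, \qquad
D_{yy} = f^{(2)}_{yy} - f^{(1)}_{yy}, \\[4pt]
i = 1:\quad u + v\,g_1' = 0, \qquad v + w\,g_1' = 0, \\
i = 2:\quad (u - D_{xx}) + (v - D_{xy})\,g_2' = 0, \qquad
            (v - D_{xy}) + (w - D_{yy})\,g_2' = 0 .
\end{gather*}
% Add the first relation for i = 2 to g_1' times the second relation for i = 2:
\begin{equation*}
(u + v\,g_1') + g_2'\,(v + w\,g_1')
  - \bigl[ D_{xx} + D_{xy}\,(g_1' + g_2') + D_{yy}\,g_1'\,g_2' \bigr] = 0 .
\end{equation*}
% The first two brackets vanish by the relations for i = 1, which leaves (7):
\begin{equation*}
D_{xx} + D_{xy}\,(g_1' + g_2') + D_{yy}\,g_1'\,g_2' = 0 .
\end{equation*}
```

The unknown second derivatives of f drop out entirely, which is why (7) is a condition on the curves (6) alone.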

2.2. Let there be given, on the x-axis in the (x, y)-plane, two points $(\xi_1, 0)$,
$(\xi_2, 0)$, where $\xi_1 < \xi_2$, and let there be given two real numbers $\mu_1 \neq 0$, $\mu_2 \neq 0$.
We ask for a curve y = g(x) with the following properties:


I. g(x) is analytic for $-\infty < x < +\infty$.

II. $g(\xi_1) = 0$, $g(\xi_2) = 0$.

III. $g'(\xi_1) = \mu_1$, $g'(\xi_2) = \mu_2$.

IV. $g(x) \neq 0$ for all values of x different from $\xi_1$ and $\xi_2$.


It is clear that if sgn $\mu_1$ = sgn $\mu_2$, then this problem is impossible. Indeed, in
this case sgn $g(\xi_1 + \varepsilon)$ = −sgn $g(\xi_2 - \varepsilon)$ for small positive values of ε. Hence
g(x) must have some zero between $\xi_1$ and $\xi_2$, in contradiction with condition IV.
If however


$$\operatorname{sgn} \mu_1 \neq \operatorname{sgn} \mu_2,$$


then the problem is clearly possible. In the special case when $\mu_1 = -\mu_2$, the
problem can be solved by a function g(x) of the form $Ax^2 + Bx + C$. The
general case can be reduced to this particular case by a transformation of the
form $\xi = \alpha e^{\beta x}$, $\eta = y$, where α and β are properly chosen constants. The very
elementary details are left to the reader.
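The problem of 2.2 can be worked out numerically. The sketch below is one possible realization, not the authors' own computation: it assumes that the general case is handled by the exponential change of variable X = exp(βx), under which a quadratic in X has zeros only at the two prescribed points. The function name `make_g` and the constants A and β are illustrative.

```python
import math

def make_g(xi1, xi2, mu1, mu2):
    """Return g with g(xi1) = g(xi2) = 0, g'(xi1) = mu1, g'(xi2) = mu2
    and no other real zero. Requires xi1 < xi2 and sgn(mu1) != sgn(mu2)."""
    if not (xi1 < xi2 and mu1 * mu2 < 0):
        raise ValueError("need xi1 < xi2 and slopes of opposite sign")
    if mu1 == -mu2:
        # Special case: the quadratic A (x - xi1)(x - xi2) has slopes
        # A (xi1 - xi2) at xi1 and A (xi2 - xi1) at xi2.
        A = mu1 / (xi1 - xi2)
        return lambda x: A * (x - xi1) * (x - xi2)
    # General case (assumed substitution X = exp(beta * x)):
    # for g = A (X - X1)(X - X2) the slope ratio is
    # g'(xi1) / g'(xi2) = -exp(beta * (xi1 - xi2)), so beta is fixed by mu1 / mu2.
    beta = math.log(-mu1 / mu2) / (xi1 - xi2)
    X1, X2 = math.exp(beta * xi1), math.exp(beta * xi2)
    A = mu1 / (beta * X1 * (X1 - X2))
    return lambda x: A * (math.exp(beta * x) - X1) * (math.exp(beta * x) - X2)

def slope(g, x, h=1e-6):
    # Central difference, used only to check the prescribed slopes.
    return (g(x + h) - g(x - h)) / (2 * h)
```

Because exp(βx) is strictly monotone, the two factors vanish only at $\xi_1$ and $\xi_2$ respectively, so condition IV holds; the function is analytic for all real x as a polynomial in exp(βx).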

2.3. Let there be given two surfaces 


$$\Sigma_i : z = f^{(i)}(x, y), \qquad i = 1, 2,$$










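analytic for $-\infty < x < +\infty$, $-\infty < y < +\infty$. Suppose we have two
points $(\xi_1, 0)$, $(\xi_2, 0)$ upon the x-axis, such that


$$f^{(1)}(\xi_k, 0) = f^{(2)}(\xi_k, 0), \qquad f_x^{(1)}(\xi_k, 0) = f_x^{(2)}(\xi_k, 0), \qquad f_y^{(1)}(\xi_k, 0) = f_y^{(2)}(\xi_k, 0), \qquad k = 1, 2.$$


In other words, the surfaces $\Sigma_1$, $\Sigma_2$ are in contact with each other above the points
$(\xi_1, 0)$ and $(\xi_2, 0)$. We also make the following additional assumptions:


$$f^{(1)}_{xx}(\xi_k, 0) \neq f^{(2)}_{xx}(\xi_k, 0), \qquad f^{(1)}_{xy}(\xi_k, 0) \neq f^{(2)}_{xy}(\xi_k, 0), \qquad k = 1, 2, \tag{8}$$


$$\operatorname{sgn} \frac{f^{(2)}_{xx}(\xi_1, 0) - f^{(1)}_{xx}(\xi_1, 0)}{f^{(2)}_{xy}(\xi_1, 0) - f^{(1)}_{xy}(\xi_1, 0)} \neq \operatorname{sgn} \frac{f^{(2)}_{xx}(\xi_2, 0) - f^{(1)}_{xx}(\xi_2, 0)}{f^{(2)}_{xy}(\xi_2, 0) - f^{(1)}_{xy}(\xi_2, 0)}. \tag{9}$$


We shall see presently that under these assumptions there exists a surface


$$\Sigma : z = f(x, y)$$


with the following properties:

I. f(x, y) is analytic for $-\infty < x < +\infty$, $-\infty < y < +\infty$.

II. Σ and $\Sigma_i$ (i = 1, 2) are in contact with each other along an analytic curve
$C_i$ whose (x, y)-projection is an analytic curve passing through $(\xi_1, 0)$ and
$(\xi_2, 0)$. This projection admits of an equation $y = g_i(x)$, where $g_i(x)$ is analytic
for $-\infty < x < +\infty$.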

2.4. The preceding assertions can be verified as follows. We choose

\[ g_1(x) = 0, \tag{10} \]

while $g_2(x)$ is chosen as a solution of the problem considered in 2.2, with

\[
\mu_k = -\frac{f^{(2)}_{xx}(\xi_k, 0) - f^{(1)}_{xx}(\xi_k, 0)}{f^{(2)}_{xy}(\xi_k, 0) - f^{(1)}_{xy}(\xi_k, 0)}, \qquad k = 1, 2. \tag{11}
\]

On account of (9), we have $\operatorname{sgn} \mu_1 \neq \operatorname{sgn} \mu_2$, and hence the problem of 2.2 is
possible. Thus we have a function $g_2(x)$ with the following properties:

I. $g_2(x)$ is analytic for $-\infty < x < +\infty$,

II. $g_2(\xi_k) = 0$, $k = 1, 2$,

III. $g_2'(\xi_k) = \mu_k$, $k = 1, 2$, where $\mu_k$ is given by (11),

IV. $g_2(x) \neq 0$ for all values of $x$ different from $\xi_1$ and $\xi_2$.

We also notice that we have, on account of (8), the property

V. $g_2'(\xi_k) \neq 0$, $k = 1, 2$.

2.5. We denote the curves $y = g_i(x)$, $i = 1, 2$, by $c_i$. We further denote by
$C_i$ the curve on $\Sigma_i$ whose $(x, y)$-projection is $c_i$. We are going to exhibit a
surface

\[ \Sigma : z = f(x, y), \]

analytic for $-\infty < x < +\infty$, $-\infty < y < +\infty$, which is in contact with
$\Sigma_i$ along $C_i$, $i = 1, 2$. We look for $f(x, y)$ in the form

\[ f(x, y) = a y^3 + b y^2 + c y + d, \]

where $a$, $b$, $c$, $d$ are functions of $x$. If we put

\[ y_i = g_i(x), \qquad z_i = f^{(i)}(x, y_i), \qquad m_i = f^{(i)}_y(x, y_i), \qquad i = 1, 2, \tag{12} \]

then the geometrical conditions upon $\Sigma$ are expressed by the following system of
equations:

\[
\left.
\begin{aligned}
a y_i^3 + b y_i^2 + c y_i + d &= z_i \\
3 a y_i^2 + 2 b y_i + c &= m_i
\end{aligned}
\right\} \qquad i = 1, 2. \tag{13}
\]

Our problem is to show that these equations can be solved by functions $a(x)$,
$b(x)$, $c(x)$, $d(x)$ which are analytic for $-\infty < x < +\infty$.

2.6. Let us first suppose that $x \neq \xi_k$, $k = 1, 2$. Then we have $y_2 \neq y_1$ on
account of the very definition of $y_2$ and $y_1$. The determinant of the system (13)
is found to be equal to $-(y_2 - y_1)^4$, and is therefore different from zero for
$x \neq \xi_k$, $k = 1, 2$. Since $y_i$, $z_i$, $m_i$ are analytic functions of $x$ for $-\infty < x < +\infty$,
there follows that the system (13) can be satisfied by functions $a(x)$, $b(x)$, $c(x)$,
$d(x)$ which are analytic for $-\infty < x < +\infty$, $x \neq \xi_k$, $k = 1, 2$. We have to
show that these functions remain analytic for $x = \xi_1$ and $x = \xi_2$.

2.7. Suppose $x \neq \xi_k$, $k = 1, 2$. From the system (13) we obtain by simple
computations:

\[ a = \frac{-2 (z_2 - z_1) + (m_2 + m_1)(y_2 - y_1)}{(y_2 - y_1)^3}, \tag{14} \]

\[ b = \frac{1}{2}\, \frac{m_2 - m_1}{y_2 - y_1} - \frac{3}{2}\, a\, (y_2 + y_1). \tag{15} \]

Both the numerator and the denominator in (14) are analytic for
$-\infty < x < +\infty$.
The denominator vanishes only for $x = \xi_1$ and $x = \xi_2$, and has there zeros of
order three, on account of (10) and V in 2.4. The function $a(x)$ remains
therefore analytic for $x = \xi_1$ and $x = \xi_2$, provided the numerator in (14) has zeros
there of order $\geq 3$. But this is actually the case, as follows immediately by
considering the Taylor expansions of $y_i$, $z_i$, $m_i$ resulting from (12) and using
III in 2.4.

Let us consider $b$ as given by (15). The fraction $(m_2 - m_1)/(y_2 - y_1)$ remains
analytic for $x = \xi_k$, $k = 1, 2$, since the denominator has there zeros of order one,
while the numerator also vanishes there. Since $a$ has already been recognized
as analytic for $-\infty < x < +\infty$, there follows therefore from (15) that $b$ is
also analytic for $-\infty < x < +\infty$. The analyticity of $c$ and $d$ follows then
directly from the equations (13).
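
For fixed $x$ the system (13) is a cubic Hermite interpolation problem in $y$, so the formulas for $a$ and $b$ and the value of the determinant can be checked with exact rational arithmetic. The sketch below assumes the reading $a = [-2(z_2 - z_1) + (m_2 + m_1)(y_2 - y_1)]/(y_2 - y_1)^3$ for (14) and $b = \tfrac12 (m_2 - m_1)/(y_2 - y_1) - \tfrac32 a (y_2 + y_1)$ for (15); the function names and the row order of the matrix are choices made here.

```python
from fractions import Fraction as F


def hermite_cubic(y1, z1, m1, y2, z2, m2):
    """Coefficients (a, b, c, d) of f(y) = a y^3 + b y^2 + c y + d
    with f(y_i) = z_i and f'(y_i) = m_i for i = 1, 2 (requires y1 != y2)."""
    h = y2 - y1
    a = (-2 * (z2 - z1) + (m2 + m1) * h) / h**3          # reading of (14)
    b = (m2 - m1) / (2 * h) - F(3, 2) * a * (y2 + y1)    # reading of (15)
    c = m1 - 3 * a * y1**2 - 2 * b * y1                  # slope condition at y1
    d = z1 - a * y1**3 - b * y1**2 - c * y1              # value condition at y1
    return a, b, c, d


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def system_matrix(y1, y2):
    """Matrix of (13) in the unknowns (a, b, c, d):
    the two value rows first, then the two slope rows."""
    return [
        [y1**3, y1**2, y1, 1],
        [y2**3, y2**2, y2, 1],
        [3 * y1**2, 2 * y1, 1, 0],
        [3 * y2**2, 2 * y2, 1, 0],
    ]
```

With this row order the determinant equals $-(y_2 - y_1)^4$; interchanging two rows only changes the sign, so the conclusion that it vanishes exactly when $y_2 = y_1$ does not depend on the order.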













2.8. Let there be given two surfaces which we denote this time by

\[ \Sigma_1 : z = f^{(1)}(x, y), \]
\[ \Sigma_5 : z = f^{(5)}(x, y). \]

We suppose that there are two points $(\xi_1, 0)$, $(\xi_2, 0)$ on the $x$-axis, such that
$\Sigma_1$ and $\Sigma_5$ are in contact with each other above these points. We denote by $P$
the common point of $\Sigma_1$ and $\Sigma_5$ above $(\xi_1, 0)$ and by $Q$ the common point of $\Sigma_1$
and $\Sigma_5$ above $(\xi_2, 0)$.

We shall see presently that there exist three auxiliary surfaces

\[ \Sigma_k : z = f^{(k)}(x, y), \qquad k = 2, 3, 4, \]

analytic for $-\infty < x < +\infty$, $-\infty < y < +\infty$, such that for $i = 1, 2, 3, 4$
the surfaces $\Sigma_i$ and $\Sigma_{i+1}$ are in contact with each other along an analytic curve
$C_i$ which passes through $P$ and $Q$.

This follows immediately from 2.3 by the obvious remark that we can
determine (in many ways) a surface $\Sigma_3$ such that the pairs $\Sigma_1$, $\Sigma_3$ and $\Sigma_3$, $\Sigma_5$ both satisfy
the conditions stated in 2.3.

2.9. Let us modify the assumptions in 2.8 by presupposing that the surfaces
$\Sigma_1$ and $\Sigma_5$ are given only above a certain rectangle

\[ R : \alpha < x < \beta, \quad -\gamma < y < \gamma, \quad \gamma > 0. \]

Then the lemma of 2.8 still holds, with the modification that the existence of the
surfaces $\Sigma_2$, $\Sigma_3$, $\Sigma_4$ follows only above the same rectangle $R$.
Indeed, the transformation

\[
\bar{x} = \tan\left[ \frac{\pi}{\beta - \alpha} \left( x - \frac{\alpha + \beta}{2} \right) \right], \qquad
\bar{y} = \tan \frac{\pi y}{2 \gamma}
\]

carries the interior of $R$ into the whole finite $(\bar{x}, \bar{y})$-plane, and thus the present
situation is reduced to the one considered in 2.8.
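
The effect of this change of variables can be tried out numerically. The sketch below assumes the reading $\bar{x} = \tan[\pi (x - (\alpha + \beta)/2)/(\beta - \alpha)]$, $\bar{y} = \tan(\pi y / 2\gamma)$ of the garbled display; the function names are hypothetical.

```python
import math


def to_plane(x, y, alpha, beta, gamma):
    """Map a point of the open rectangle alpha < x < beta, -gamma < y < gamma
    to the whole (x-bar, y-bar)-plane."""
    xb = math.tan(math.pi / (beta - alpha) * (x - (alpha + beta) / 2))
    yb = math.tan(math.pi * y / (2 * gamma))
    return xb, yb


def to_rectangle(xb, yb, alpha, beta, gamma):
    """Inverse map: every point of the plane comes from a point of the rectangle."""
    x = (alpha + beta) / 2 + (beta - alpha) / math.pi * math.atan(xb)
    y = 2 * gamma / math.pi * math.atan(yb)
    return x, y
```

Both maps are analytic and mutually inverse on their domains, which is what allows analytic surfaces over the plane to be carried back to analytic surfaces over the rectangle.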











§3. A lemma

3.1. Lemma. Let there be given two figures

\[ (\Sigma_1, K_1, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_{12}) \]

and

\[ (\Sigma_5, K_5, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_{52}), \]

such that the straight segment connecting the $(x_1, x_2)$-projections of $P_1$ and $P_2$ is
comprised in both of the domains above which the surfaces $\Sigma_1$ and $\Sigma_5$ are defined.
Suppose that $\Sigma_1$, $\Sigma_5$, $K_1$, $K_5$, $P_1$, $P_2$, $\Gamma_1$, $\Gamma_2$ are kept fixed, while $\bar{P}_1$ varies on $\Gamma_1$
and consequently $\bar{P}_{12}$ and $\bar{P}_{52}$ vary on $\Gamma_2$.

Then $\bar{P}_{12} = \bar{P}_{52}$ for $\bar{P}_1$ close enough to $P_1$.

3.2. To prove this, we observe that we have, on account of 2.9, three auxiliary
surfaces $\Sigma_2$, $\Sigma_3$, $\Sigma_4$, such that $\Sigma_i$ and $\Sigma_{i+1}$ are in contact with each other along an
analytic curve $C_i$ which passes through $P_1$ and $P_2$ ($i = 1, 2, 3, 4$). The surface
$\Sigma_j$ is then a cross-section of a cable $K_j$ ($j = 1, \dots, 5$). If $\bar{P}_1$ on $\Gamma_1$ is close enough
to $P_1$, then $\bar{P}_1$ will be comprised in all five cables $K_j$ and we have five figures

\[ (\Sigma_j, K_j, P_1, P_2, \Gamma_1, \Gamma_2, \bar{P}_1, \bar{P}_{j2}). \]

From 1.10 there follows then successively

\[ \bar{P}_{12} = \bar{P}_{22} = \bar{P}_{32} = \bar{P}_{42} = \bar{P}_{52}. \]

Hence $\bar{P}_{12} = \bar{P}_{52}$.


§4. Statement and reduction of the fundamental lemma

4.1. Let there be given a finite number of curves

\[ \Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \]

of the family $\Phi$. These curves are not supposed to be all different from each
other. Choose a point $P_i = (x_1^{(i)}, x_2^{(i)}, t^{(i)})$ on $\Gamma_i$ ($i = 1, 2, \dots, n, n + 1$) in
such a way that the $(x_1, x_2)$-projections of $P_i$ and $P_{i+1}$ are different from each
other ($i = 1, 2, \dots, n$). Choose then simply connected bounded domains $D_i$
in the $(x_1, x_2)$-plane, such that $D_i$ contains the $(x_1, x_2)$-projections of $P_i$ and
$P_{i+1}$ ($i = 1, 2, \dots, n$). Choose finally surfaces

\[ \Sigma_i : t = F^{(i)}(x_1, x_2), \qquad i = 1, 2, \dots, n, \]

with the following properties:

I. $F^{(i)}(x_1, x_2)$ is analytic in the interior and on the boundary of $D_i$.

II. $\Sigma_i$ passes through the points $P_i$ and $P_{i+1}$, and is there transversal to the
curves $\Gamma_i$ and $\Gamma_{i+1}$ respectively.

We shall call such a configuration a chain and we shall use for it the notation

\[
\Psi = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \\
P_1, P_2, \dots, P_n, P_{n+1} \\
\Sigma_1, \Sigma_2, \dots, \Sigma_n
\end{bmatrix}. \tag{16}
\]

We shall also write $\Psi_n$ whenever it will be convenient to indicate the number of
surfaces $\Sigma$ appearing in the chain.

4.2. A chain is called polygonal if the domain $D_i$ contains the straight segment
joining the $(x_1, x_2)$-projections of $P_i$ and $P_{i+1}$ ($i = 1, \dots, n$).

4.3. A chain is called closed if $\Gamma_{n+1} = \Gamma_1$, $P_{n+1} = P_1$.

4.4. Let there be given two chains $\Psi$, $\bar{\Psi}$, the first by (16) and the second by

\[
\bar{\Psi} = \begin{bmatrix}
\bar{\Gamma}_1, \bar{\Gamma}_2, \dots, \bar{\Gamma}_n, \bar{\Gamma}_{n+1} \\
\bar{P}_1, \bar{P}_2, \dots, \bar{P}_n, \bar{P}_{n+1} \\
\bar{\Sigma}_1, \bar{\Sigma}_2, \dots, \bar{\Sigma}_n
\end{bmatrix}. \tag{17}
\]

These chains will be called associated if 1) $\bar{\Gamma}_i = \Gamma_i$, $i = 1, 2, \dots, n, n + 1$, and
2) $\Sigma_i$ and $\bar{\Sigma}_i$ are cross-sections of the same cable $K_i$ ($i = 1, 2, \dots, n$).

4.5. THE FUNDAMENTAL LEMMA: If $\Psi$ is a closed chain, then every chain $\bar{\Psi}$,
associated with $\Psi$, is also closed.

We shall show presently that this lemma can be reduced to a simpler one
which will be proved in §5.

4.6. We first observe that it is sufficient to prove the following weaker
statement.

THE RESTRICTED LEMMA:¹⁰ Let a closed chain be given by (16). Then there
exists an open sub-arc $\gamma_1$ of $\Gamma_1$, containing $P_1$, such that every chain

\[
\bar{\Psi} = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \\
\bar{P}_1, \bar{P}_2, \dots, \bar{P}_n, \bar{P}_{n+1} \\
\bar{\Sigma}_1, \bar{\Sigma}_2, \dots, \bar{\Sigma}_n
\end{bmatrix}, \tag{18}
\]

associated with $\Psi$, is also closed provided $\bar{P}_1$ is on $\gamma_1$.

4.7. Let us suppose that this restricted lemma has already been proved.
Then the fundamental lemma of 4.5 can be established as follows. The
associated chains $\Psi$ and $\bar{\Psi}$ being given by (16) and (18), the surfaces $\Sigma_i$ and $\bar{\Sigma}_i$ are
by assumption cross-sections of the same cable $K_i$. We have therefore (see 1.7)
$n$ figures

\[ (\Sigma_i, K_i, P_i, P_{i+1}, \Gamma_i, \Gamma_{i+1}, \bar{P}_i, \bar{P}_{i+1}). \]

Let $P_1^*$ be a point on $\Gamma_1$ between $P_1$ and $\bar{P}_1$. Then $P_1^*$ is comprised in the cable
$K_1$, and thus there passes through $P_1^*$ a cross-section $\Sigma_1^*$ of $K_1$. Then $\Sigma_1^*$
intersects $\Gamma_2$ in a point $P_2^*$ between $P_2$ and $\bar{P}_2$, and we have the figure

\[ (\Sigma_1, K_1, P_1, P_2, \Gamma_1, \Gamma_2, P_1^*, P_2^*). \]

Applying the same construction to $P_2^*$, we obtain a figure

\[ (\Sigma_2, K_2, P_2, P_3, \Gamma_2, \Gamma_3, P_2^*, P_3^*), \]

and so on. There arises in this manner a chain

\[
\Psi^* = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \\
P_1^*, P_2^*, \dots, P_n^*, P_{n+1}^* \\
\Sigma_1^*, \Sigma_2^*, \dots, \Sigma_n^*
\end{bmatrix},
\]

which is associated both with $\Psi$ and $\bar{\Psi}$.


Let us denote by $S$ the set of those points $P_1^*$ on the closed sub-arc $P_1 \bar{P}_1$ of
$\Gamma_1$ for which $P_1^* = P_{n+1}^*$. If $P_1^*$ varies continuously, then $P_2^*, \dots, P_n^*, P_{n+1}^*$
also vary continuously. It is then obvious that the set $S$ is closed. If a point
$P_1^*$ belongs to $S$, then every point close to $P_1^*$ on $\Gamma_1$ also belongs to $S$, on account
of the restricted lemma. Thus the points of $S$ between $P_1$ and $\bar{P}_1$ constitute an
open set on $\Gamma_1$. Finally, $S$ is not empty, since $P_1$ certainly belongs to $S$. From
these properties of $S$ there follows, by a familiar reasoning, that $S$ contains all
points of the closed sub-arc $P_1 \bar{P}_1$ of $\Gamma_1$. Thus, in particular, $\bar{P}_1$ is in $S$, and this
is exactly the fact asserted in the fundamental lemma.

10 This lemma is essentially equivalent to the fact, important in the proof of Blaschke,
loc. cit.⁴, that the displacement of infinitesimal arcs, introduced by him, is integrable.
See §8 concerning the proof of Blaschke.

4.8. We next observe that it is sufficient to prove the restricted lemma of 4.6
in the special case when the given chain is polygonal (see 4.2). Indeed, let there
be given a closed chain (16), and let (18) be a chain associated with (16). Let
us consider the simply connected domain $D_i$ above which the surface $\Sigma_i$ is
defined. This domain $D_i$ contains the $(x_1, x_2)$-projections of the points $P_i$ and
$P_{i+1}$. Let us connect these projections by a simple polygonal line $\pi_i$ in $D_i$.
Let us denote by

\[ P_i^{(0)} = P_i,\ P_i^{(1)},\ P_i^{(2)},\ \dots,\ P_i^{(r_i)} = P_{i+1} \tag{19} \]

the points of $\Sigma_i$ above the vertices of $\pi_i$, and by

\[ \Gamma_i^{(0)} = \Gamma_i,\ \Gamma_i^{(1)},\ \Gamma_i^{(2)},\ \dots,\ \Gamma_i^{(r_i)} = \Gamma_{i+1} \tag{20} \]

the curves of the family $\Phi$ which are transversal to $\Sigma_i$ at the points (19), and
finally by

\[ \bar{P}_i^{(0)} = \bar{P}_i,\ \bar{P}_i^{(1)},\ \bar{P}_i^{(2)},\ \dots,\ \bar{P}_i^{(r_i)} = \bar{P}_{i+1} \]

the intersections of the curves (20) with $\bar{\Sigma}_i$. Let us put

\[ \Sigma_i^{(0)} = \Sigma_i^{(1)} = \dots = \Sigma_i^{(r_i - 1)} = \Sigma_i. \]

By introducing these new elements, we obtain from $\Psi$ and $\bar{\Psi}$ new chains $\Psi^*$
and $\bar{\Psi}^*$ which are again associated. The new chain $\Psi^*$ is again closed, and it is
also polygonal. Hence, if we presuppose that the restricted lemma of 4.6 has
already been proved for the polygonal case, there follows that $\bar{\Psi}^*$, and
consequently $\bar{\Psi}$ also, is closed, provided $\bar{P}_1^{(0)} = \bar{P}_1$ is close enough to $P_1^{(0)} = P_1$.


§5. Proof of the fundamental lemma 
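5.1. On account of 4.8, it is sufficient to prove the following statement: If

\[
\Psi_n = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_1 \\
P_1, P_2, \dots, P_n, P_1 \\
\Sigma_1, \Sigma_2, \dots, \Sigma_n
\end{bmatrix}
\]

is a closed polygonal chain, then there exists on $\Gamma_1$ an arc $\gamma_1$ containing $P_1$, such
that every associated chain

\[
\bar{\Psi}_n = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_1 \\
\bar{P}_1, \bar{P}_2, \dots, \bar{P}_n, \bar{P}_{n+1} \\
\bar{\Sigma}_1, \bar{\Sigma}_2, \dots, \bar{\Sigma}_n
\end{bmatrix}
\]

is closed, provided $\bar{P}_1$ is on $\gamma_1$.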




5.2. Let us first observe that for n = 2 this assertion differs only in the 
wording from the lemma of 3.1. This settles the case n = 2. 

5.3. Let us consider then the case $n = 3$. Then, by the definition of a chain
(see 4.1) the $(x_1, x_2)$-projections $P_1'$, $P_2'$, $P_3'$ of the points $P_1$, $P_2$, $P_3$ are distinct
points. Let us denote by $D$ a simply connected bounded domain in the $(x_1, x_2)$-plane
which contains the straight segments joining the points $P_1'$, $P_2'$, $P_3'$.
Choose a surface

\[ \Sigma : t = F(x_1, x_2) \]

with the following properties:

I. $F(x_1, x_2)$ is analytic in the interior and on the boundary of $D$.

II. The surface $\Sigma$ passes through the points $P_1$, $P_2$, $P_3$ and is there transversal
to the curves $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ respectively.

Clearly, we can choose $F(x_1, x_2)$ as a polynomial, for instance.

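5.4. This surface $\Sigma$ is a cross-section of a cable $K$. On the other hand, the
surfaces $\Sigma_1$, $\Sigma_2$, $\Sigma_3$, appearing in

\[
\Psi_3 = \begin{bmatrix}
\Gamma_1, \Gamma_2, \Gamma_3, \Gamma_1 \\
P_1, P_2, P_3, P_1 \\
\Sigma_1, \Sigma_2, \Sigma_3
\end{bmatrix},
\]

are cross-sections of certain cables $K_1$, $K_2$, $K_3$. The fibers of these four cables
through $P_1$, $P_2$, $P_3$ are sub-arcs of $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ respectively.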

5.5. Let us pick a point $\bar{P}_1$ on $\Gamma_1$. If $\bar{P}_1$ is comprised in the cable $K_1$, then
there passes through $\bar{P}_1$ a cross-section $\bar{\Sigma}_1$ of $K_1$. If the intersection $\bar{P}_2$ of $\bar{\Sigma}_1$
with $\Gamma_2$ is comprised in the cable $K_2$, then there passes through $\bar{P}_2$ a cross-section
$\bar{\Sigma}_2$ of $K_2$. If the intersection $\bar{P}_3$ of $\bar{\Sigma}_2$ with $\Gamma_3$ is comprised in the cable $K_3$,
then there passes through $\bar{P}_3$ a cross-section $\bar{\Sigma}_3$ of $K_3$. Let us denote by $\bar{P}_4$
the intersection of $\bar{\Sigma}_3$ with $\Gamma_1$. We have to prove that $\bar{P}_4 = \bar{P}_1$, provided $\bar{P}_1$ is
close to $P_1$ on $\Gamma_1$.

If the initial point $\bar{P}_1$ is chosen close to $P_1$ on $\Gamma_1$, then the above construction
can be carried out and the resulting points $\bar{P}_2$, $\bar{P}_3$, $\bar{P}_4$ are close to $P_2$, $P_3$, $P_1$
respectively. On the other hand, if $\bar{P}_1$ is close to $P_1$, then $\bar{P}_1$ is comprised in the
cable $K$, introduced in 5.4. Let us denote by $\bar{\Sigma}$ the cross-section of $K$ through
$\bar{P}_1$ and by $P_2^*$, $P_3^*$ the intersections of $\bar{\Sigma}$ with $\Gamma_2$ and $\Gamma_3$ respectively. If $\bar{P}_1$ is
close to $P_1$, then the points $P_2^*$, $P_3^*$ will be close to $P_2$ and $P_3$ respectively.

Applying then the lemma of 3.1 to the cables $K$ and $K_1$, $K$ and $K_2$, $K$ and $K_3$,
there follows successively that $\bar{P}_2 = P_2^*$, $\bar{P}_3 = P_3^*$, $\bar{P}_4 = \bar{P}_1$, provided the points
$\bar{P}_1, \bar{P}_2, \dots$ are close enough to $P_1$, $P_2$, $P_3$ respectively. But this will be the
case provided only $\bar{P}_1$ is close enough to $P_1$. Thus the case $n = 3$ is also
settled.

5.6. Let us consider finally the case $n > 3$. We pick a curve $\Gamma_0$, of the
family $\Phi$, distinct from $\Gamma_1, \Gamma_2, \dots, \Gamma_n$. On $\Gamma_0$ we choose a point $P_0$, such
that the $(x_1, x_2)$-projection of $P_0$ is different from the $(x_1, x_2)$-projection of
$P_i$ ($i = 1, 2, \dots, n$). Then we choose surfaces $\Sigma_i'$ such that $\Sigma_i'$ passes through
$P_0$ and $P_i$ and is transversal there to $\Gamma_0$ and $\Gamma_i$ respectively, and such that the
straight segment joining the $(x_1, x_2)$-projections of $P_0$ and $P_i$ is comprised in the
domain above which $\Sigma_i'$ is defined ($i = 1, 2, \dots, n$).

The statement in 5.1 follows then immediately by applying the result,
established in 5.3, 5.4, 5.5, successively to the chains

\[
\begin{bmatrix}
\Gamma_i, \Gamma_{i+1}, \Gamma_0, \Gamma_i \\
P_i, P_{i+1}, P_0, P_i \\
\Sigma_i, \Sigma_{i+1}', \Sigma_i'
\end{bmatrix}
\]

for $i = 1, 2, \dots, n$.


§6. A congruence relation 


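6.1. Let $P$ and $\bar{P}$ be two distinct points on a curve $\Gamma$ of the family $\Phi$, and
$P^*$ and $\bar{P}^*$ two distinct points on a curve $\Gamma^*$ of the family $\Phi$. If there exists a
pair of associated chains

\[
\Psi = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \\
P_1, P_2, \dots, P_n, P_{n+1} \\
\Sigma_1, \Sigma_2, \dots, \Sigma_n
\end{bmatrix} \tag{21}
\]

and

\[
\bar{\Psi} = \begin{bmatrix}
\Gamma_1, \Gamma_2, \dots, \Gamma_n, \Gamma_{n+1} \\
\bar{P}_1, \bar{P}_2, \dots, \bar{P}_n, \bar{P}_{n+1} \\
\bar{\Sigma}_1, \bar{\Sigma}_2, \dots, \bar{\Sigma}_n
\end{bmatrix}, \tag{22}
\]

such that

\[ \Gamma_1 = \Gamma, \quad \Gamma_{n+1} = \Gamma^*, \quad P_1 = P, \quad P_{n+1} = P^*, \quad \bar{P}_1 = \bar{P}, \quad \bar{P}_{n+1} = \bar{P}^*, \tag{23} \]

then we shall say that the arc $P\bar{P}$ of $\Gamma$ is congruent to the arc $P^*\bar{P}^*$ of $\Gamma^*$ and we
shall write

\[ (P, \bar{P}, \Gamma) = (P^*, \bar{P}^*, \Gamma^*).^{11} \tag{24} \]

6.2. There follows immediately from the fundamental lemma (see 4.5) that if
$P$, $\bar{P}$, $\Gamma$, $P^*$, $\Gamma^*$ are given, then there exists on $\Gamma^*$ at most one point $\bar{P}^*$ such that
(24) holds.

The statements contained in the following sections 6.3, 6.4, 6.5 are immediate
consequences of the definition of the congruence relation.

6.3. The congruence (24) implies

\[ (\bar{P}, P, \Gamma) = (\bar{P}^*, P^*, \Gamma^*). \]

11 See 0.5 concerning the origin of this definition.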




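6.4. The congruences

\[ (P, \bar{P}, \Gamma) = (P^*, \bar{P}^*, \Gamma^*) \quad \text{and} \quad (P^*, \bar{P}^*, \Gamma^*) = (P', \bar{P}', \Gamma') \]

imply

\[ (P, \bar{P}, \Gamma) = (P', \bar{P}', \Gamma'). \]

6.5. If $\Sigma$ and $\bar{\Sigma}$ are cross-sections of the same cable $K$, then the arcs
intercepted by $\Sigma$ and $\bar{\Sigma}$ on the fibers of $K$ are all congruent to each other.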


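6.6. The following question is important. Given $P$, $\bar{P}$, $\Gamma$, $P^*$, $\Gamma^*$, does there
actually exist a point $\bar{P}^*$ on $\Gamma^*$, such that (24) holds?

Let us pick a curve $\Gamma_0$ of the family $\Phi$, distinct from $\Gamma$ and $\Gamma^*$, and a point
$P_0$ on $\Gamma_0$, such that the $(x_1, x_2)$-projections of both $P$ and $P^*$ are different from
the $(x_1, x_2)$-projection of $P_0$. We pass through $P$ and $P_0$ a surface $\Sigma$ which is
transversal to $\Gamma$ and $\Gamma_0$ at $P$ and $P_0$ respectively, and through $P^*$ and $P_0$ a
surface $\Sigma^*$ which is transversal to $\Gamma^*$ and $\Gamma_0$ at $P^*$ and $P_0$ respectively. If $\bar{P}$
is a point on $\Gamma$ close to $P$, then (see 1.8) $\bar{P}$ gives rise to a figure

\[ (\Sigma, K, P, P_0, \Gamma, \Gamma_0, \bar{P}, \bar{P}_0), \]

where $\bar{P}_0$ is close to $P_0$ on $\Gamma_0$. This point $\bar{P}_0$ then gives rise to a figure

\[ (\Sigma^*, K^*, P_0, P^*, \Gamma_0, \Gamma^*, \bar{P}_0, \bar{P}^*). \]

The points $\bar{P}$ and $\bar{P}_0$ are on the same cross-section $\bar{\Sigma}$ of $K$, and the points $\bar{P}_0$ and
$\bar{P}^*$ are on the same cross-section $\bar{\Sigma}^*$ of $K^*$. We have then the associated chains

\[
\begin{bmatrix}
\Gamma, \Gamma_0, \Gamma^* \\
P, P_0, P^* \\
\Sigma, \Sigma^*
\end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix}
\Gamma, \Gamma_0, \Gamma^* \\
\bar{P}, \bar{P}_0, \bar{P}^* \\
\bar{\Sigma}, \bar{\Sigma}^*
\end{bmatrix}.
\]

Consequently

\[ (P, \bar{P}, \Gamma) = (P^*, \bar{P}^*, \Gamma^*). \tag{25} \]

That is to say: given $P$ on $\Gamma$ and $P^*$ on $\Gamma^*$, there exists, for every point $\bar{P}$ on $\Gamma$
close enough to $P$, a point $\bar{P}^*$ on $\Gamma^*$ such that (25) holds.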
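6.7. Suppose we have a congruence

\[ (P, \bar{P}, \Gamma) = (P^*, \bar{P}^*, \Gamma^*). \tag{26} \]

By definition, we have then a pair of associated chains $\Psi$, $\bar{\Psi}$ of the form

\[
\Psi = \begin{bmatrix}
\Gamma, \Gamma_2, \dots, \Gamma_n, \Gamma^* \\
P, P_2, \dots, P_n, P^* \\
\Sigma_1, \Sigma_2, \dots, \Sigma_n
\end{bmatrix},
\qquad
\bar{\Psi} = \begin{bmatrix}
\Gamma, \Gamma_2, \dots, \Gamma_n, \Gamma^* \\
\bar{P}, \bar{P}_2, \dots, \bar{P}_n, \bar{P}^* \\
\bar{\Sigma}_1, \bar{\Sigma}_2, \dots, \bar{\Sigma}_n
\end{bmatrix}.
\]

Let $\tilde{P}$ be any point on $\Gamma$ between $P$ and $\bar{P}$. The same construction as that
used in 4.7 yields then a chain $\tilde{\Psi}$ of the form

\[
\tilde{\Psi} = \begin{bmatrix}
\Gamma, \Gamma_2, \dots, \Gamma_n, \Gamma^* \\
\tilde{P}, \tilde{P}_2, \dots, \tilde{P}_n, \tilde{P}^* \\
\tilde{\Sigma}_1, \tilde{\Sigma}_2, \dots, \tilde{\Sigma}_n
\end{bmatrix},
\]

which is associated both with $\Psi$ and $\bar{\Psi}$. By the definition of the congruence
relation there follows therefore:

If we are given a congruence (26), then for every point $\tilde{P}$ on $\Gamma$ between $P$ and $\bar{P}$
there exists a point $\tilde{P}^*$ on $\Gamma^*$ between $P^*$ and $\bar{P}^*$, such that

\[ (P, \tilde{P}, \Gamma) = (P^*, \tilde{P}^*, \Gamma^*) \tag{27} \]

and

\[ (\tilde{P}, \bar{P}, \Gamma) = (\tilde{P}^*, \bar{P}^*, \Gamma^*). \tag{28} \]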
6.8. Suppose we are given two congruences

\[ (P, \bar{P}, \Gamma) = (P^*, \bar{P}^*, \Gamma^*) \tag{29} \]

and

\[ (P, \tilde{P}, \Gamma) = (P^*, \tilde{P}^*, \Gamma^*), \tag{30} \]

such that $\tilde{P}$ is between $P$ and $\bar{P}$ on $\Gamma$. Since, on account of 6.2, $\tilde{P}^*$ is univocally
determined by the other elements of the congruence (30), we can think of $\tilde{P}^*$ as
obtained by the construction referred to in 6.7. From (28) there follows
therefore:

The congruences (29) and (30) imply (28).


§7. The group $\tau(\rho)$

7.1. We pick a curve $\Gamma_0$ of the family $\Phi$ and a point $P_0$ on $\Gamma_0$. The curve $\Gamma_0$
and the point $P_0$ will be kept fixed in the sequel. Let us represent, in a one-to-one
and analytic but otherwise arbitrary manner, the curve $\Gamma_0$ upon some interval
$\rho' < \rho < \rho''$ of the values of a parameter $\rho$, in such a way that $P_0$ corresponds
to $\rho = 0$. We shall denote by $P_0(\rho)$ the point corresponding to the value $\rho$ of
the parameter. We have then, in particular, $P_0 = P_0(0)$.

7.2. If $\rho_1$ and $\rho_2$ are small, then the points $P_0(\rho_1)$ and $P_0(\rho_2)$ are close to $P_0$.
On account of 6.6, we have then a unique point $P_0^*$ on $\Gamma_0$, such that

\[ (P_0(0), P_0(\rho_2), \Gamma_0) = (P_0(\rho_1), P_0^*, \Gamma_0). \]

This point $P_0^*$ corresponds to a certain value $\rho^*$ of the parameter, and $\rho^*$ is a
univocally determined function of $\rho_1$ and $\rho_2$:

\[ \rho^* = \rho^*(\rho_1, \rho_2). \]

We have then

\[ (P_0(0), P_0(\rho_2), \Gamma_0) = (P_0(\rho_1), P_0(\rho^*), \Gamma_0) \tag{31} \]

identically for small values of $\rho_1$ and $\rho_2$.







7.3. If $\rho$ is small, $P_0(\rho)$ is close to $P_0$, and hence we have a point $P_0^-$ on $\Gamma_0$,
such that

\[ (P_0(\rho), P_0(0), \Gamma_0) = (P_0(0), P_0^-, \Gamma_0). \]

This point $P_0^-$ corresponds to a certain value $\rho^-$ of the parameter, and $\rho^-$ is a
univocally determined function of $\rho$:

\[ \rho^- = \rho^-(\rho). \]

We have then

\[ (P_0(\rho), P_0(0), \Gamma_0) = (P_0(0), P_0(\rho^-), \Gamma_0) \tag{32} \]

identically for small values of $\rho$. We can write (see 6.3) the congruence (32) in
the form

\[ (P_0(0), P_0(\rho), \Gamma_0) = (P_0(\rho^-), P_0(0), \Gamma_0). \tag{33} \]

On account of 6.2, there follows from (31) and (33) the relation

\[ \rho^*(\rho^-(\rho), \rho) = 0. \tag{34} \]


7.4. We are now ready to define the group $\tau(\rho)$ with the properties stated in
1.12. Let $\sigma$ be any surface-element, not parallel to the $t$-axis. Let $P$ be the
base-point of $\sigma$, and $\Gamma$ the curve of the family $\Phi$ which is transversal to $\sigma$ at $P$.
If $\rho$ is small enough, we have then (see 6.6) a point $P(\rho)$ on $\Gamma$ such that

\[ (P_0(0), P_0(\rho), \Gamma_0) = (P, P(\rho), \Gamma). \]

Let us denote by $\sigma(\rho)$ the surface-element which is transversal to $\Gamma$ at $P(\rho)$.
We denote the transformation leading from $\sigma$ to $\sigma(\rho)$ by $\tau(\rho)$. We assert that
the transformations $\tau(\rho)$ constitute a one-parameter group of contact
transformations with the properties asserted in the theorem of 1.12.

7.5. Condition II in 1.12 is obviously satisfied. Let us discuss now the
analytic character of the transformation $\tau(\rho)$. Let a surface-element $\sigma$ vary in
the vicinity of some fixed surface-element $\bar{\sigma}$. Denote by $P$ and $\bar{P}$ the
base-points of $\sigma$ and $\bar{\sigma}$ respectively, and by $\Gamma$ the curve of the family $\Phi$ which passes
through $P$ and is transversal there to $\sigma$. We choose a curve $\Gamma^*$ of the family $\Phi$
and a point $P^*$ on $\Gamma^*$ in such a way that the $(x_1, x_2)$-projection of $P^*$ is different
from the $(x_1, x_2)$-projections of $P_0$ (see 7.1) and of $\bar{P}$. Then we choose a surface
$\Sigma^*$ which passes through $P_0$ and $P^*$ and is transversal there to $\Gamma_0$ (see 7.1) and
$\Gamma^*$ respectively. We also choose a surface $\Sigma$ with the following properties:

I. $\Sigma$ passes through $P^*$ and $P$ and is transversal there to $\Gamma^*$ and $\Gamma$ respectively.

II. $\Sigma$ depends analytically upon the five coördinates of the variable
surface-element $\sigma$.

Clearly, such a surface $\Sigma$ can be defined in terms of a polynomial in $x_1$, $x_2$.

We keep $\Sigma^*$ fixed, while $\Sigma$ necessarily varies when $\sigma$ varies. The analytic
character of $\tau(\rho)$, as required by condition I in 1.12, is then an obvious
consequence of the condition of analyticity (see 1.11) and of the analytic character
of the transversality $T$, provided we use the chain

\[
\begin{bmatrix}
\Gamma_0, \Gamma^*, \Gamma \\
P_0, P^*, P \\
\Sigma^*, \Sigma
\end{bmatrix}
\]

to construct the base-point $P(\rho)$ of the surface-element $\sigma(\rho)$ which corresponds
to $\sigma$ under $\tau(\rho)$.

7.6. Let us verify that $\tau(\rho)$ is a contact transformation. Consider any
surface $\Sigma$ (see 1.4). Let $P_1$, $P_2$ be any two points on $\Sigma$. Denote by $\Gamma_1$, $\Gamma_2$ the
curves of the family $\Phi$ which are transversal to $\Sigma$ at $P_1$, $P_2$ respectively, and by
$\sigma_1$, $\sigma_2$ the surface-elements which are tangent to $\Sigma$ at those points. If $\rho$ is small,
then the transformation $\tau(\rho)$ can be applied to $\sigma_1$, and there results a
surface-element $\sigma_1(\rho)$ with base-point $P_1(\rho)$ located on $\Gamma_1$. We have then (see 7.4)

\[ (P_0(0), P_0(\rho), \Gamma_0) = (P_1, P_1(\rho), \Gamma_1). \tag{35} \]

If $\rho$ is small, then $P_1(\rho)$ will be comprised in the cable $K$ of which $\Sigma$ is a
cross-section. Let $\Sigma(\rho)$ be the cross-section of $K$ through $P_1(\rho)$, and $P_2^*$ the
intersection of $\Sigma(\rho)$ with the fiber of $K$ through $P_2$. This fiber is a sub-arc of $\Gamma_2$, and
we have, on account of 6.5,

\[ (P_1, P_1(\rho), \Gamma_1) = (P_2, P_2^*, \Gamma_2). \tag{36} \]

From (35) and (36) there follows (see 6.4)

\[ (P_0(0), P_0(\rho), \Gamma_0) = (P_2, P_2^*, \Gamma_2). \]

Hence (see 7.4) the point $P_2^*$ coincides with the base-point $P_2(\rho)$ of the
surface-element $\sigma_2(\rho)$ which corresponds to $\sigma_2$ under $\tau(\rho)$. Since $\sigma_2(\rho)$ is transversal to
$\Gamma_2$ (see 7.4), there follows that $\sigma_2(\rho)$ is tangent to $\Sigma(\rho)$. If we keep $P_1$ fixed and
let $P_2$ vary on $\Sigma$, then there follows that the surface-elements which are tangent
to any fixed surface $\Sigma$ are carried by $\tau(\rho)$ into surface-elements which are
tangent to a surface $\Sigma(\rho)$. Thus $\tau(\rho)$ is a contact-transformation.

7.7. There remains to show that the transformations $\tau(\rho)$ constitute a group.
Consider a surface-element $\sigma$ with base-point $P$ and the curve $\Gamma$ of the family $\Phi$
which is transversal to $\sigma$ at $P$. If $\rho_1$, $\rho_2$ are small enough, then we can apply
the transformations $\tau(\rho_1)$ and $\tau(\rho^*(\rho_1, \rho_2))$ (see 7.2) to $\sigma$. The base-points
$P(\rho_1)$ and $P(\rho^*)$ of the resulting surface-elements $\sigma(\rho_1)$ and $\sigma(\rho^*)$ are then both
located on $\Gamma$ (see 7.4). We have, by 7.4, the relations

\[ (P_0(0), P_0(\rho_1), \Gamma_0) = (P, P(\rho_1), \Gamma), \]
\[ (P_0(0), P_0(\rho^*), \Gamma_0) = (P, P(\rho^*), \Gamma). \]

Hence, by 6.8,

\[ (P_0(\rho_1), P_0(\rho^*), \Gamma_0) = (P(\rho_1), P(\rho^*), \Gamma). \tag{37} \]

We also have (see (31) in 7.2)

\[ (P_0(\rho_1), P_0(\rho^*), \Gamma_0) = (P_0(0), P_0(\rho_2), \Gamma_0). \tag{38} \]

From (37) and (38) there follows

\[ (P_0(0), P_0(\rho_2), \Gamma_0) = (P(\rho_1), P(\rho^*), \Gamma). \]

This shows (see 7.4) that $\sigma(\rho^*)$ corresponds to $\sigma(\rho_1)$ under $\tau(\rho_2)$. That is to say,
we have

\[ \tau(\rho_1)\,\tau(\rho_2) = \tau(\rho^*), \tag{39} \]

where $\rho^*$ is the function defined in 7.2. If we apply (39) to the special pair
$\rho_1 = \rho^-(\rho)$, $\rho_2 = \rho$, where $\rho^-(\rho)$ is the function defined in 7.3, then there follows,
by (34) in 7.3, that $\tau(\rho^-)\tau(\rho)$ is the identical transformation. Thus the closure
property and the existence of the inverse (both in the vicinity of the identity)
are established.
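
The relations (31), (34) and (39) can be illustrated on a toy model. Everything in the sketch below is an assumption made for illustration only: points of a curve are labelled by signed arc length $s$, two arcs are called congruent when they have equal signed length (as happens for straight lines and orthogonality, see 8.5), and $\Gamma_0$ is parametrized by $s = \varphi(\rho) = e^{\rho} - 1$, a deliberately non-additive parameter.

```python
import math


def phi(rho):
    """Signed arc length of P0(rho) measured from P0(0) (hypothetical choice)."""
    return math.exp(rho) - 1.0


def phi_inv(s):
    """Inverse of phi, defined for s > -1."""
    return math.log1p(s)


def rho_star(rho1, rho2):
    """Toy version of (31): the arc P0(0)P0(rho2) equals the arc P0(rho1)P0(rho*)."""
    return phi_inv(phi(rho1) + phi(rho2))


def rho_minus(rho):
    """Toy version of (32): the arc P0(rho)P0(0) equals the arc P0(0)P0(rho-)."""
    return phi_inv(-phi(rho))


def tau(rho):
    """Toy tau(rho): move every base-point along its curve by the length phi(rho)."""
    return lambda s: s + phi(rho)
```

In this model $\rho^*(\rho_1, \rho_2) = \log(e^{\rho_1} + e^{\rho_2} - 1)$, which is not $\rho_1 + \rho_2$; the group is additive only in the parameter $\varphi(\rho)$, in line with the remark of 7.1 that the parametrization of $\Gamma_0$ is otherwise arbitrary.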


§8. Conclusion 


8.1. The reader will have noticed that the complications, which we had to 
contend with in §2 to §5, can be traced to our narrow formulation of the cable- 
condition (see 1.6). In the paper of Blaschke, the condition is stated (for the 
special case when the given transversality $T$ reduces to orthogonality) in the
following form:

All those curves of the family $\Phi$, which are orthogonal to a surface S with
continuous first derivatives, are also orthogonal to a family of surfaces which 
depend continuously upon a parameter. 

The point we wish to stress is that Blaschke states the condition for surfaces 
which are only supposed to have continuous first derivatives. 

8.2. On the basis of this formulation, Blaschke is able to settle in about three 
lines the issue we were forced to discuss in §2 to §5. In other words: about 
half of our paper deals with complications which are due solely to the fact that 
we did not adopt the liberal formulation of Blaschke for what we called the 
cable-condition. We feel that this situation calls for an explanation. 

8.3. Let us consider the special case when the given family $\Phi$ consists of
straight lines, while the given transversality $T$ reduces to orthogonality. Then
the one-parameter family of surfaces, appearing in the statement used by 
Blaschke (see 8.1), consists of the parallel surfaces of the original surface S. 
The coérdinates of a surface S*, parallel to S, depend upon the partial deriva- 
tives of the first order of the coérdinates of S. If we assume, as Blaschke does, 
only that S has continuous first derivatives, then it is difficult to see how we can 
speak of the tangent planes of S*, which are however necessary if we want S* 
to be orthogonal to the normals of S. Furthermore, if the second derivatives of S 
are not available, then we cannot expect that there will exist some region which 
is simply covered by the normals of S. It seems then difficult to give an exact 
meaning to the geometrical notions involved in the proof of Blaschke. 







8.4. Thus it seems that it is necessary to work with surfaces which have, at 
least, continuous derivatives of the second order. The remark in 2.1 becomes 
then operative, and if we take into account a number of minor difficulties of a 
topological character, then it seems hardly possible to give an adequate pres- 
entation of the proof without considerably exceeding the length of the paper of 
Blaschke. 

8.5. In conclusion, we add the following remark. Let s be a first-order strip 
with base-curve C. Let us denote by r® the one-parameter family of straight 
lines orthogonal to s at the points of C. A curve C*, which intersects orthogo- 
nally the lines of r®, will be called a parallel curve of C. Then, as is well known, 
the segments intercepted on the lines of r® by C and by a parallel curve C* 
have equal lengths. Hence: if C is a closed curve, then every curve C*, parallel
to C, is also closed.
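
A planar analogue of this remark is easy to test numerically. The sketch below is a simplification and not the spatial situation of the text: the strip is replaced by a plane ellipse with its normals, the parallel curve is taken at a fixed distance `d` along the outward normal, and all names are hypothetical.

```python
import math


def ellipse(t):
    """Base curve C: a plane ellipse with semi-axes 2 and 1."""
    return (2.0 * math.cos(t), math.sin(t))


def unit_normal(t):
    """Outward unit normal of the ellipse at parameter t."""
    tx, ty = -2.0 * math.sin(t), math.cos(t)
    n = math.hypot(tx, ty)
    return (ty / n, -tx / n)


def parallel(t, d):
    """Point of the parallel curve C* at distance d from C along the normal."""
    (x, y), (nx, ny) = ellipse(t), unit_normal(t)
    return (x + d * nx, y + d * ny)


def tangent_dot_normal(t, d, h=1e-6):
    """Central-difference tangent of C* projected on the normal line of C."""
    (x0, y0), (x1, y1) = parallel(t - h, d), parallel(t + h, d)
    nx, ny = unit_normal(t)
    return ((x1 - x0) * nx + (y1 - y0) * ny) / (2.0 * h)
```

The curve at constant distance meets every normal of C orthogonally, so it is a parallel curve in the sense above, and it closes up after one circuit because C does.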

If we generalize this statement by replacing the family of all straight lines by a
general four-parameter family $\Phi$ of curves, and by replacing orthogonality by a
general transversality $T$, then we obtain what may be called a strip-condition,
in contradistinction with the cable-condition which is a surface-condition. 

Suppose then we start with the strip-condition. Obviously, the surface- 
condition and the fundamental lemma are immediate consequences of the strip- 
condition. Consequently, §2 to §5 would drop out, as far as the proof of the 
main theorem is concerned. On the other hand, in order to show that the result 
thus obtained is equivalent to the result obtained by previous authors, it would 
be necessary to show that the cable-condition is equivalent to the strip-condition. 
From this point of view, §§2 to 5 would appear as a proof of the fact that the 
cable-condition implies the strip-condition (the converse being obvious). 


OHIO STATE UNIVERSITY.
















ANNALS OF MATHEMATICS 
Vol. 36, No. 3, July, 1935 


FOUNDATIONS OF THE THEORY OF LIE GROUPS 
By W. MAYER AND T. Y. THOMAS


(Received March 20, 1935) 


CONTENTS

Introduction
I. Topological preliminaries
1. Topological spaces and subspaces
2. [title illegible]
II. Coordinate groups
3. [title illegible]
4. [title illegible]
III. Lie groups. Adjoint groups
5. [title illegible]
6. [title illegible]
7. [title illegible]
IV. Representations. Covering groups
8. The differential equations of the representation
9. [title illegible]
10. [title illegible]
11. Relations between a Lie group and its covering group
12. [title illegible]
13. [title illegible]
14. [title illegible]
15. Integral theory of representations for compact groups
V. Fundamental theorems of Lie
16. Construction of the group germ from the differential equations
17. [title illegible]
VI. Subgroups
18. Subgroups of R_m-coordinate groups (m = 2)
19. Differential equations of the subgroup
20. [title illegible]
21. Representations as subgroups of a matrix group
22. [title illegible]
VII. Invariant subgroups
23. Conditions for a subgroup to be invariant
24. Conditions for an invariant subgroup
25. [title illegible]
26. Two subgroups whose elements are commutable




Introduction. It has been our aim in the following pages to give the 
foundations of the theory of Lie groups, which is necessarily a theory in the large. 
Most of the results obtained can easily be extended to the case of transformation 
groups. The treatment has been given in considerable detail since, as far 
as we are aware, all textbooks on Lie groups have adopted the local point of 
view and in so doing have failed to deal with groups in the strict sense of the 
word. It is difficult to specify what is new and what is not new owing to the 
extensive literature on groups; in this connection, however, especial mention may 
be made of the papers by O. Schreier, whose method has been employed in the 
extension of the group germ to the complete group. We believe, however, that 
many of the procedures which we have used are original. The part of the 
theory which concerns the existence of subgroups, invariant subgroups, etc. 
has been brought into relation with the theory of Lie algebras, the conditions 
obtained constituting in fact the definitions of the corresponding concepts of 
this algebraic theory. When this point has been reached the recent book by 
L. P. Eisenhart, Continuous Groups of Transformations, Princeton University 
Press, 1933, may be consulted for further developments. 


I. Topological preliminaries 


1. Topological spaces and subspaces. Let $\mathfrak{A}$ be a topological space, i.e. 
a set of elements (points) in which neighborhoods are defined which satisfy the 
four axioms of Hausdorff. Denoting by $U(a)$ a neighborhood of the element 
$a$ of $\mathfrak{A}$ these axioms are as follows: 

($\alpha$) $a \subset U(a)$, i.e. $a$ is contained in $U(a)$. 

($\beta$) If $b \subset U(a)$ then there exists a $U(b)$ such that $U(b) \subset U(a)$. 

($\gamma$) $U_1(a) \cap U_2(a) \supset U_3(a)$, i.e. the intersection of any two neighborhoods of $a$ 
contains a neighborhood of $a$. 

($\delta$) If $a \ne b$, then $U(a) \cap U(b) = 0$, i.e. if $a$ and $b$ are different elements 
there exist neighborhoods $U(a)$ and $U(b)$ having no point in common. 

From axioms ($\beta$) and ($\gamma$) it follows readily that the intersection of two 
intersecting neighborhoods is an open point set, i.e. a point set each point of which 
has a neighborhood containing only points of the set. 

Two sets of neighborhoods $\{U\}$ and $\{U'\}$ of a topological space $\mathfrak{A}$ are said 
to be equivalent if to any $U'(a)$ there exists a $U(a) \subset U'(a)$ and conversely. 
The topological properties of the space $\mathfrak{A}$ are (by definition) unchanged by 
replacing any set of neighborhoods by an equivalent set. If an open point 
set $\Delta \subset \mathfrak{A}$ is added as a new neighborhood of any of its points we obtain a set 
of equivalent neighborhoods as is easily proved. Use of such neighborhoods $\Delta$ 
will be made in the following without explicit mention. 

Let $\mathfrak{A}_1$ be a subspace of $\mathfrak{A}$. Then $\mathfrak{A}_1$ becomes a topological space if to each 
point $p \subset \mathfrak{A}_1$ we define neighborhoods $W(p)$ as the intersections 

$W(p) = \mathfrak{A}_1 \cap U(p)$, 

where $U(p)$ is any neighborhood of $p$ in $\mathfrak{A}$. The proof follows readily. 
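
The four axioms and the subspace construction $W(p) = \mathfrak{A}_1 \cap U(p)$ can be tried out on finite examples. The following is only a sketch under stated assumptions: the helper names `check_hausdorff_axioms` and `subspace` and the two sample spaces are hypothetical illustrations, not taken from the paper, and neighborhoods are modelled as finite Python sets.

```python
from itertools import combinations

def check_hausdorff_axioms(nbhds):
    """nbhds maps each point to a list of frozensets (its neighborhoods).
    Returns a dict telling which of the axioms (alpha)...(delta) hold."""
    points = list(nbhds)
    # (alpha): every neighborhood of a contains a
    alpha = all(a in U for a in points for U in nbhds[a])
    # (beta): each b in U(a) has a neighborhood U(b) inside U(a)
    beta = all(
        any(V <= U for V in nbhds[b])
        for a in points for U in nbhds[a] for b in U
    )
    # (gamma): the intersection of two neighborhoods of a contains a third
    gamma = all(
        any(W <= (U & V) for W in nbhds[a])
        for a in points for U in nbhds[a] for V in nbhds[a]
    )
    # (delta): distinct points have disjoint neighborhoods
    delta = all(
        any(not (U & V) for U in nbhds[a] for V in nbhds[b])
        for a, b in combinations(points, 2)
    )
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "delta": delta}

def subspace(nbhds, subset):
    """Induced neighborhoods W(p) = A1 ∩ U(p) on the subset A1."""
    sub = frozenset(subset)
    return {p: [sub & U for U in nbhds[p]] for p in sub}

# Hypothetical sample spaces (assumptions made for illustration only).
discrete = {p: [frozenset({p}), frozenset({1, 2, 3})] for p in (1, 2, 3)}
two_point = {0: [frozenset({0})], 1: [frozenset({0, 1})]}  # violates (delta)

print(check_hausdorff_axioms(discrete))
print(check_hausdorff_axioms(two_point))
print(check_hausdorff_axioms(subspace(discrete, {1, 2})))
```

The second sample space satisfies ($\alpha$), ($\beta$) and ($\gamma$) but not ($\delta$), which shows that the fourth axiom is independent of the first three.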














772 W. MAYER AND T. Y. THOMAS 


A second possibility of defining neighborhoods in $\mathfrak{A}_1$ is to take the components 
(connected point sets) $V(p) \supset p$ of the above neighborhoods $W(p)$. We shall 
prove this statement as it is not immediately obvious. 

Since $p \subset V(p)$ by definition the axiom ($\alpha$) is satisfied. 

To prove axiom ($\beta$) consider $U(p) \supset V(p) \supset q$ where $q$ is any point of $V(p)$. 
Since the neighborhoods $U$ satisfy the Hausdorff axioms there exists a 
neighborhood $U(q) \subset U(p)$. Hence $V(q) \subset U(q) \subset U(p)$. From $V(q) \subset U(p)$ and 
the fact that any point of $V(q)$ is connected to the point $p$ it follows that 
$V(q) \subset V(p)$. 

Consider two neighborhoods $V_1(p) \subset W_1(p)$ and $V_2(p) \subset W_2(p)$. Since 
the $W$ are Hausdorff neighborhoods there is a $W_3(p) \subset W_1(p) \cap W_2(p)$. Take 
$q \subset V_3(p) \subset W_3(p)$. Then $q$ is connected to $p$ by a curve in $W_3(p)$ and hence 
by a curve in $W_1(p)$ and $W_2(p)$. Hence $q \subset V_1(p)$ and $q \subset V_2(p)$, i.e. 
$V_3(p) \subset V_1(p) \cap V_2(p)$ which establishes axiom ($\gamma$). 

If $p$ and $q$ are different points of $\mathfrak{A}_1$ there exist neighborhoods $W(p)$ and $W(q)$ 
such that $W(p) \cap W(q) = 0$. Hence $V(p) \cap V(q) = 0$, since $V(p) \subset W(p)$ and 
$V(q) \subset W(q)$. This establishes axiom ($\delta$) and completes the proof that $\mathfrak{A}_1$ is a 
topological space on the basis of the connected neighborhoods $V(p)$. 


2. Topological groups. A topological group is a set of elements (points) 
which constitute a group with respect to a given law of combination and for 
which neighborhoods are defined satisfying the four axioms of Hausdorff 
(topological space). 

In a topological group a continuous function $F(a)$ of the elements $a$ of the 
group can be defined if the set of dependent elements $F$ is likewise a topological 
space. We say that $F = F(a)$ is continuous at $a$ if, for any neighborhood 
$U(F)$ of $F$, there exists a neighborhood $U(a)$ such that the map of $U(a)$, i.e. 
$F(U(a))$, lies in $U(F)$. 

In particular we can speak of the continuity of the composition function of 
a topological group $\mathfrak{A}$. If $b$ and $a$ are two elements of the group $\mathfrak{A}$ there exists 
an element $f$ of the group such that 

(1) $b = f \cdot a$, 

i.e. $b$ is the "product" of the elements $f$ and $a$. Now consider the function 
$b'(a')$ defined by 

(1') $b' = f \cdot a'$ 

where $f$ is fixed and $a'$ is an arbitrary element of the group. The function $b'$ is 
continuous at $a$ if, for any neighborhood $U(b)$ of $b$, there exists a neighborhood 
$U(a)$ of $a$ such that 

(2) $U(b) \supset f \cdot U(a)$, 

i.e. if the map of $U(a)$ lies in $U(b)$. If this relation holds for any two points $b$, $a$ 
of the group $\mathfrak{A}$ we say that $\mathfrak{A}$ is right continuous or that $\mathfrak{A}$ is an $R$-continuous 
topological group. By replacing the right members of (1) and (1') by $a \cdot f$ and $a' \cdot f$ 
respectively we arrive in an analogous manner at the concept of the $L$-continuous 
topological group. As there is only a formal distinction between the $R$ and $L$ 
groups we select the $R$-continuous topological group as the basis of the following 
discussion.¹ 

Let $\mathfrak{A}$ be an $R$-continuous topological group and let $b$ and $a$ be any two 
elements of $\mathfrak{A}$. Then $a = \bar f \cdot b$ for some element $\bar f$ of the group. For any 
neighborhood $U_1(a)$ a neighborhood $U_1(b)$ can therefore be found such that the map 
$\bar f \cdot U_1(b)$ of $U_1(b)$ lies in $U_1(a)$, i.e. 

(2') $U_1(a) \supset \bar f \cdot U_1(b)$, or $f \cdot U_1(a) \supset U_1(b)$ 

in consequence of (1), the elements $f$ and $\bar f$ being inverse. The conditions (2) 
and (2') state: To any neighborhood $U(b)$ of $b$ there corresponds a neighborhood 
$U(a)$ of $a$ such that the map $f \cdot U(a)$ lies in $U(b)$ and to any neighborhood 
$U(a)$ there corresponds a neighborhood $U(b)$ such that $U(b)$ lies in the map 
$f \cdot U(a)$. It follows from this fact and the continuity property of the group 
that the maps $f \cdot U(a)$ corresponding to a fixed element $a$ of the group satisfy 
the axioms of Hausdorff.² Hence these maps $f \cdot U(a)$ constitute a set of 
neighborhoods equivalent to the neighborhoods originally defined in the group $\mathfrak{A}$ and can 
therefore be introduced as neighborhoods into the group $\mathfrak{A}$ without in any way 
changing its topological structure. 

Now let $a$, $b$, $c$ be any three elements of a topological group $\mathfrak{A}$. Consider 

(3) $c = f \cdot a$, $a = g \cdot b$, 
(3') $c' = f \cdot a'$, $a' = g \cdot b'$, 

where the last two equations, for fixed elements $f$ and $g$ of the group, define the 
functions $c'(a')$ and $a'(b')$ respectively. It follows from (3) and (3') that 

(4) $c = (f \cdot g) \cdot b$, 
(4') $c' = (f \cdot g) \cdot b'$. 

Since $f \cdot g$ is fixed in (4') this equation defines the function $c'(b')$. If $c'(a')$ is 
continuous at $a$ and $a'(b')$ is continuous at $b$ it is evident that $c'(b')$ is 
continuous at $b$. If for every element $e \subset \mathfrak{A}$ and for a fixed element $a \subset \mathfrak{A}$, the 
functions $e'(a')$ and $a'(e')$ are continuous at $a$ and $e$ respectively, then for any 
two points $c$ and $b$ of $\mathfrak{A}$ the function $c'(b')$ is continuous at $b$. Hence, a 
topological group possessing the property of reciprocal right continuity of all elements 
with respect to a fixed element is an $R$-continuous topological group. 

1 In fact to say that the group is $R$ or $L$ continuous is essentially a matter of notation; 
for example, an $R$-continuous group with connection $a \cdot b$ is an $L$-continuous group with 
connection $b \times a$ if we define $a \cdot b = b \times a$. It is only of significance to speak of the 
$R$-continuity of a given $L$-continuous group, or conversely. 

A topological group may be $L$-continuous but not $R$-continuous, and conversely. We 
shall show this by means of the following example due to D. Montgomery. 

Consider the set of all (discontinuous as well as continuous) transformations of the 
interval $0 \le x \le 1$ into itself and denote by $M$ the metric space which is formed by this 
set when the distance between two such transformations $f(x)$ and $g(x)$ is defined to be the 
least upper bound of the absolute value of their difference on the specified interval. The 
space $M$ is a group space if the combination of $f(x)$ and $g(x)$, for brevity $f \cdot g$, is defined 
to be $f[g(x)]$. It will now be shown that the function $f \cdot g$ as thus defined is $L$-continuous 
but not $R$-continuous. If the distance from $f_1$ to $f_2$ is less than $\epsilon$ then by definition 
$|f_1(x) - f_2(x)| < \epsilon$ for all $x$ in the interval and it follows for any fixed $g$ that 
$|f_1[g(x)] - f_2[g(x)]| < \epsilon$ for every $x$ in the interval. $L$-continuity has therefore been shown. 
In order to show that there is $R$-discontinuity two functions $f$ and $g$ will be defined as well 
as a sequence of functions approaching $g$: 

$f(x) = x$ for $0 \le x < \tfrac{1}{2}$, 
$f(x) = \tfrac{3}{2} - x$ for $\tfrac{1}{2} \le x \le 1$, 
$g(x) = x$ for $0 \le x \le 1$, 
$g_n(x) = \ldots$ for $0 \le x \le 1$. 

For all $n$, $f[g_n(\tfrac{1}{2})] < \tfrac{1}{2}$ and since $f[g(\tfrac{1}{2})] = 1$, it follows for all $n$ that the distance from $f \cdot g$ 
to $f \cdot g_n$ is at least $\tfrac{1}{2}$. This proves that the group is not $R$-continuous. 

2 Taking $b = f \cdot a$ we will denote the map $f \cdot U(a)$ by $S(b)$, i.e. 

($\alpha$) $S(b) = f \cdot U(a)$. 

By this definition $b \subset S(b)$ so that the first axiom holds. 
To prove the second axiom take 

($\beta$) $b_1 \subset S(b)$. 

Then there exists an element $a_1 \subset U(a)$ such that $b_1 = f \cdot a_1$. Since $a_1 \subset U(a)$ and $U(a)$ 
is a Hausdorff neighborhood there exists a neighborhood $U(a_1)$ such that 

($\gamma$) $U(a_1) \subset U(a)$. 

By the group property there exists an element $g$ such that $a_1 = g \cdot a$. For $g$ fixed, $a_1' = g \cdot a'$ 
defines the continuous function $a_1'(a')$ and hence for any neighborhood $U(a_1)$ there exists 
a neighborhood $\bar U(a)$ such that 

($\delta$) $U(a_1) \supset g \cdot \bar U(a)$. 

From ($\gamma$) and ($\delta$) we have 

($\epsilon$) $U(a) \supset g \cdot \bar U(a)$. 

By left hand multiplication of ($\epsilon$) by the element $f$ we obtain 

($\zeta$) $S(b) \supset f \cdot g \cdot \bar U(a)$ 

on account of ($\alpha$). But $b_1 = (f \cdot g) \cdot a$ since $b_1 = f \cdot a_1$ and $a_1 = g \cdot a$. Hence by the definition 
($\alpha$) we have $(f \cdot g) \cdot \bar U(a) = S(b_1)$. From ($\zeta$) it therefore follows that $S(b) \supset S(b_1)$ which 
proves the second axiom. 

Let $S_1(b)$ and $S_2(b)$ be two maps defined by ($\alpha$). Then there exist neighborhoods 
$U_1(b)$ and $U_2(b)$ such that $U_1(b) \subset S_1(b)$ and $U_2(b) \subset S_2(b)$. From the third Hausdorff 
axiom there exists a neighborhood $U_3(b)$ such that $U_1(b) \cap U_2(b) \supset U_3(b)$. But there 
exists a $S_3(b) \subset U_3(b)$. Hence $S_1 \cap S_2 \supset U_1 \cap U_2 \supset U_3 \supset S_3$ proving the third axiom. 

To prove the fourth axiom take $a$ and $b$ to be different elements. Then there exists 
a $U(a)$ and a $U(b)$ for which $U(a) \cap U(b) = 0$. But $U(a) \supset S(a)$ and $U(b) \supset S(b)$ for 
some $S(a)$ and $S(b)$. Hence $S(a) \cap S(b) = 0$. 
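
Montgomery's example in footnote 1 can be illustrated numerically. The sketch below rests on two assumptions that should be read as such: $f$ is taken as $x$ on $[0, \tfrac12)$ and $\tfrac32 - x$ on $[\tfrac12, 1]$ (the reading consistent with $f[g(\tfrac12)] = 1$), and the sequence $g_n(x) = (1 - 1/n)\,x$ is a hypothetical stand-in chosen only because it tends uniformly to $g$ with $g_n(\tfrac12) < \tfrac12$; it is not claimed to be the sequence used in the footnote. The supremum distance is approximated on a finite grid.

```python
GRID = [i / 1000 for i in range(1001)]  # contains x = 0.5 exactly

def f(x):
    # Assumed reading of the footnote: x on [0, 1/2), 3/2 - x on [1/2, 1].
    return x if x < 0.5 else 1.5 - x

def g(x):
    return x

def g_n(n):
    # Hypothetical sequence tending uniformly to g (an assumption).
    return lambda x: (1.0 - 1.0 / n) * x

def compose(u, v):
    """The combination u.v, i.e. x -> u[v(x)]."""
    return lambda x: u(v(x))

def dist(u, v):
    """Grid approximation of the least upper bound of |u - v| on [0, 1]."""
    return max(abs(u(x) - v(x)) for x in GRID)

# Varying the outer factor: composing with a fixed inner map cannot
# increase the distance (the continuity established in the footnote).
f2 = lambda x: 0.9 * f(x)
square = lambda x: x * x
print(dist(f, f2), dist(compose(f, square), compose(f2, square)))

# Varying the inner factor: g_n -> g, yet f.g_n stays far from f.g.
for n in (2, 10, 100, 1000):
    print(n, dist(g_n(n), g), dist(compose(f, g), compose(f, g_n(n))))
```

The first distance in each row of the loop shrinks like $1/n$ while the second never drops below $\tfrac12$, which is the discontinuity the footnote describes.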




II. Coordinate groups 


3. $R_m$ and $L_m$ coordinate groups. Let $\Delta$ be a point set homeomorphic to a 
neighborhood $\Sigma$ of a point of the $r$-dimensional number space.³ Denote by 
$a_1, \dots, a_r$ the coordinates of a system of coordinates defined in $\Delta$ by this 
homeomorphism. By a regular coordinate transformation of the coordinates 
$a_1, \dots, a_r$ of $\Delta$ we shall then mean a relation $\bar a_1 = f_1(a), \dots, \bar a_r = f_r(a)$ defined in 
$\Delta$ with unique inverse $a_1 = \varphi_1(\bar a), \dots, a_r = \varphi_r(\bar a)$ such that the functions 
$f$ and $\varphi$ are continuous and possess continuous derivatives to a certain order 
$m$ ($\ge 1$) inclusive. 

An $r$ ($\ge 1$) parameter coordinate group is an $R$ or $L$-continuous topological 
group possessing the following two properties. First, for each element $a$ there 
exists at least one neighborhood $U(a)$ which is homeomorphic to $\Sigma$. (It is 
easily seen that the totality of coordinate neighborhoods $U$ is a set of Hausdorff 
neighborhoods equivalent to the original neighborhoods). Second, the 
coordinate transformation which is thus defined in the intersection $\Delta$ of two 
coordinate neighborhoods $U(a)$ and $U(b)$ is regular.⁴ 

The coordinate group is in no way changed if the coordinate system ($a$) of 
any neighborhood $U(a)$ is replaced by a coordinate system ($\bar a$) obtainable from 
the system ($a$) by a regular transformation. A (scalar) function $f(a_1, \dots, a_r)$ 
possessing continuous derivatives to the order $m_1 \le m$ will therefore have 
continuous derivatives to the order $m_1$ independently of the coordinate system 
adopted. 

Let $b$ and $a$ be any two elements of an $r$ parameter coordinate group. Then 
$b = f \cdot a$ for some element $f$ of the group and the relation $b' = f \cdot a'$ in which $f$ 
is held fixed defines a function $b'(a')$ as in §2. Assuming that the coordinate 
group is $R$-continuous we can represent the function $b'(a')$ in a neighborhood 
$\Delta^* \subset U(a)$ by $b_\alpha = f_\alpha(a_1, \dots, a_r)$ in terms of the coordinates $b_\alpha$ and $a_\alpha$ defined 
in the neighborhoods $U(b)$ and $U(a)$ respectively. If, at the point $a$, the 
functions $f_\alpha(a_1, \dots, a_r)$ are continuous and possess continuous derivatives to the 
order $m$ inclusive the group will be called an $R_m$-coordinate group. The 
$L_m$-coordinate group is to be defined in an analogous manner. It is to be noted 
that in this definition of the $R_m$ and $L_m$-coordinate groups the (finite) number 
$r$ ($\ge 1$) of the parameters is left unspecified as this number is immaterial in 
the following theory. We shall limit our discussion in the remainder of this 
section to the $R_m$-coordinate group as analogous remarks apply to the case of 
the $L_m$-coordinate group. 





3 By a neighborhood of the number space we understand any open point set of this space. 

4 In particular the points $a$ and $b$ may be coincident. If $c \subset U(a)$ then by axiom ($\beta$) 
there exists a $U(c) \subset U(a)$. The image $\Sigma'$ of $U(c)$ in $\Sigma$ is an open point set in $\Sigma$ since 
$U(c)$ is open in $U(a)$. On account of the homeomorphism between $U(c)$ and the open 
point set $\Sigma'$ it follows that $U(c)$ is a coordinate neighborhood (described by the coordinates 
of $U(a)$). Since the intersection $\Delta$ of the two neighborhoods $U(a)$ and $U(b)$ is an open 
point set, to any point $c \subset \Delta$ there exists a coordinate neighborhood $U(c) \subset \Delta$ having 
coordinates described both by $U(a)$ and $U(b)$. This defines in $U(c)$ a coordinate 
transformation. 



















It follows directly from the definition of the $R_m$-coordinate group and the 
above remark concerning the derivatives of the scalar function $f$ that the above 
functions $f_\alpha(a_1, \dots, a_r)$ occurring in the definition of the group are continuous 
and have continuous derivatives to the $m$th order inclusive in the neighborhood 
$\Delta^*$. 

As shown in §2 the maps $f \cdot U(a)$ corresponding to a fixed element $a$ of an 
$R$-continuous topological group constitute a set of equivalent neighborhoods in 
the group. Hence if $b$ and $a$ are any two elements of an $R_m$-coordinate group 
and if $b = f \cdot a$ then the map $f \cdot U(a)$ of a coordinate neighborhood $U(a)$ is a 
neighborhood of the element $b$. Introducing the coordinates of $U(a)$ as the 
coordinates of corresponding points of the neighborhood $f \cdot U(a)$ we see that a 
regular transformation exists between the coordinates of $f \cdot U(a)$ and any other 
coordinate neighborhood $U(c)$ of the group which intersects $f \cdot U(a)$ in an open 
point set $\Delta$, the transformation being defined throughout $\Delta$. This follows 
directly from the definition of the $R_m$-coordinate group. Hence the above 
coordinate neighborhoods $f \cdot U(a)$ can be taken as the coordinate neighborhoods of any 
$R_m$-coordinate group. 

Now let $a$, $b$, $c$ be three elements of an $R_m$-coordinate group $\mathfrak{A}$. Then $c = f \cdot a$ 
and $a = g \cdot b$ for some elements $f$ and $g$ of $\mathfrak{A}$. Hence $c = (f \cdot g) \cdot b$. Holding 
$f$ and $g$ fixed we define from these three relations in the manner previously 
explained the three functions $c'(a')$, $a'(b')$ and $c'(b')$ respectively. With reference 
to the coordinate form of these functions we now deduce the following relations 

(1) $\dfrac{\partial c_\alpha}{\partial b_\beta} = \dfrac{\partial c_\alpha}{\partial a_\gamma} \dfrac{\partial a_\gamma}{\partial b_\beta}$, 

as a consequence of the definition of the $R_m$-coordinate group. Hence the 
derivatives $\partial c_\alpha / \partial b_\beta$ exist and are continuous if the derivatives $\partial c_\alpha / \partial a_\gamma$ and $\partial a_\gamma / \partial b_\beta$ 
exist and are continuous. This statement can be generalized to apply to 
derivatives up to and including those of the $m$th order. 

In particular for any two elements $b$ and $c$ of the above group $\mathfrak{A}$ the 
derivatives $\partial c_\alpha / \partial b_\beta$ exist and are continuous if for any point $e$ and a fixed point $a$ of 
the group the derivatives $\partial e_\alpha / \partial a_\beta$ and $\partial a_\alpha / \partial e_\beta$ exist and are continuous at the 
point $a$. As this statement can be generalized to derivatives of order $m$ we 
have the following result: Consider for any element $e$ and for a fixed element 
$a$ of an $r$ parameter coordinate group the derivatives 

$\dfrac{\partial^h e_\alpha}{\partial a_{\beta_1} \cdots \partial a_{\beta_h}}$ and $\dfrac{\partial^h a_\alpha}{\partial e_{\beta_1} \cdots \partial e_{\beta_h}}$ ($h = 1, \dots, m$) 

determined on the basis of relations of the form $e' = f \cdot a'$ and $a' = \bar f \cdot e'$ in 
which the element $f$ is fixed. If the above derivatives exist and are continuous 
then the group is an $R_m$-coordinate group. 

A special case of an $R_m$-coordinate group occurs when the integer $m$ is allowed 
to increase without limit. We shall call the corresponding group an 
$R_\infty$-coordinate group. Still more particularly if we limit ourselves to the case of 
analytic functions we arrive at a group which we shall call an $R_A$-coordinate group. 
It is evident that the above considerations apply to these latter groups when 
we make the appropriate modifications concerning derivatives and coordinate 
transformations. 

A group which is both an $R_m$ and an $L_m$-coordinate group will be spoken of 
simply as an $m$-coordinate group; similarly a group which is an $R_A$ and also an 
$L_A$-coordinate group will be referred to as an $A$-coordinate group. 


4. Differential equations of the group. Let $\mathfrak{A}$ be an $R_m$-coordinate group 
and let $c$, $b$ denote any two elements of $\mathfrak{A}$. Then the function $c'(b')$ is 
continuous and has continuous derivatives to the $m$th order.⁵ We shall write 

(1) $\dfrac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_\beta(c, b)$. 

The quantities $A^\alpha_\beta(c, b)$ are of tensor character as implied by their indices. In 
fact if we make a coordinate transformation $c \to \bar c$ in $U(c)$ and a coordinate 
transformation $b \to \bar b$ in $U(b)$ we have 

(2) $\dfrac{\partial \bar c_\alpha}{\partial \bar b_\beta} = \dfrac{\partial \bar c_\alpha}{\partial c_\sigma} \dfrac{\partial c_\sigma}{\partial b_\tau} \dfrac{\partial b_\tau}{\partial \bar b_\beta}$; 

hence 

(3) $\bar A^\alpha_\beta(\bar c, \bar b) = A^\sigma_\tau(c, b) \dfrac{\partial \bar c_\alpha}{\partial c_\sigma} \dfrac{\partial b_\tau}{\partial \bar b_\beta}$. 

Since $c'(b')$ possesses continuous derivatives to the order $m$, the $A^\alpha_\beta(c, b)$ will 
have continuous derivatives with respect to the coordinates $b_\alpha$ to the order $m - 1$; 
in particular if $m = 1$ these quantities are continuous in $b$. 

Let $d$ denote the identity element of $\mathfrak{A}$. If $c = b$ we deduce from the 
correspondence $c' = f \cdot b'$ defining the function $c'(b')$ that $f = d$. Hence $c_\alpha = b_\alpha$ 
and $\partial c_\alpha / \partial b_\beta = \delta^\alpha_\beta$. It follows from (1) that 

(4) $A^\alpha_\beta(c, c) = \delta^\alpha_\beta$. 

Now from (1) of §3 we have 

(5) $A^\alpha_\beta(c, b) = A^\alpha_\sigma(c, a) A^\sigma_\beta(a, b)$; 

hence if $c = b$ we obtain 

(6) $A^\alpha_\sigma(c, a) A^\sigma_\beta(a, c) = \delta^\alpha_\beta$. 

The matrix $\| A^\alpha_\beta(c, a) \|$ is therefore inverse to the matrix $\| A^\alpha_\beta(a, c) \|$. By 
equating the determinants of both members of (6) it follows, in consequence of 
the fact that the $A^\alpha_\beta(c, a)$ and the $A^\alpha_\beta(a, c)$ are continuous in $a$ and $c$ respectively, 
that the determinant 

(7) $| A^\alpha_\beta(a, b) | \ne 0$ 

for any two elements $a$ and $b$ of $\mathfrak{A}$. 

5 If the points $c$ and $b$ are identical the function $c'(b')$ defines a coordinate 
transformation of a neighborhood $U(b)$. 

As solutions of (6) the quantities $A^\alpha_\beta(c, a)$ are given as rational functions of 
the $A^\alpha_\beta(a, c)$, these latter quantities possessing continuous derivatives in $c_\alpha$ to 
the order $m - 1$; hence the same is true of the quantities $A^\alpha_\beta(c, a)$. That is, 
the quantities $A^\alpha_\beta(a, b)$ are continuous and have continuous derivatives to the order 
$m - 1$ in $a_\alpha$ and $b_\alpha$ where $a$ and $b$ are any two elements of $\mathfrak{A}$. 

From the equations (1) of §3 we have 

(8) $\dfrac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_\sigma(c, a) A^\sigma_\beta(a, b)$. 

These are the first fundamental differential equations of the group $\mathfrak{A}$. It follows 
from the derivation of (8) that the intermediary point $a$ can be selected 
arbitrarily; the relations (5) express the property of the quantities $A^\alpha_\beta(a, b)$ which 
have this fact as a consequence. 

In the following we shall select as the arbitrary point $a$ in (8) the identity 
element $d$ and write 

(9) $A^\alpha_\beta(a, d) = A^\alpha_{(\beta)}(a)$, $A^\alpha_\beta(d, a) = A^{(\alpha)}_\beta(a)$. 

Then from (4) we have 

(10) $A^\alpha_{(\beta)}(d) = A^{(\alpha)}_\beta(d) = \delta^\alpha_\beta$; 

also from (6), 

(11) $A^\alpha_{(\sigma)}(a) A^{(\sigma)}_\beta(a) = A^{(\alpha)}_\sigma(a) A^\sigma_{(\beta)}(a) = \delta^\alpha_\beta$. 

In (10) and (11) we have made no explicit use of the fact that $d$ is the identity 
element, the selection $a = d$ being made only on account of the uniqueness of 
the determination of this element. 

In consequence of (9) the system (8) becomes 

(12) $\dfrac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_{(\sigma)}(c) A^{(\sigma)}_\beta(b)$, 

which is the usual form of writing the fundamental differential equations. As 
shown above the quantities $A^\alpha_{(\beta)}$ and $A^{(\alpha)}_\beta$ in (12) possess continuous derivatives 
to the order $m - 1$; in particular for $m = 1$ these functions are continuous. 
The above equations with appropriate modifications concerning regularity 
and analyticity apply of course for the case of the $R_\infty$-coordinate groups and 
the $R_A$-coordinate group respectively. 
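
The relations (4) to (7) and the factorization (12) can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical two-parameter group that is not taken from the paper: the pair $(p, q)$ with $p > 0$ stands for the matrix with rows $(p, 0)$ and $(q, 1)$, combined by matrix multiplication. The quantities $A^\alpha_\beta(c, b)$ are estimated as the Jacobian of $b' \to h \cdot b'$ with $h = c \cdot b^{-1}$ held fixed; the function names are illustrative only.

```python
def mul(x, y):
    """Hypothetical group law: (p, q) stands for the matrix [[p, 0], [q, 1]]."""
    return (x[0] * y[0], x[1] * y[0] + y[1])

def inv(x):
    """Inverse element for the law above (requires p != 0)."""
    return (1.0 / x[0], -x[1] / x[0])

IDENTITY = (1.0, 0.0)
UNIT = [[1.0, 0.0], [0.0, 1.0]]

def A(c, b, eps=1e-6):
    """A(c, b): Jacobian at b' = b of b' -> h.b' with h = c.b^{-1} fixed,
    by central differences (rows: alpha, columns: beta)."""
    h = mul(c, inv(b))
    J = [[0.0, 0.0], [0.0, 0.0]]
    for beta in range(2):
        plus, minus = list(b), list(b)
        plus[beta] += eps
        minus[beta] -= eps
        cp, cm = mul(h, plus), mul(h, minus)
        for alpha in range(2):
            J[alpha][beta] = (cp[alpha] - cm[alpha]) / (2 * eps)
    return J

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-6):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

a, b, c = (2.0, 0.5), (0.5, -1.0), (3.0, 2.0)

print("(4):", close(A(c, c), UNIT))
print("(5):", close(A(c, b), matmul(A(c, a), A(a, b))))
print("(6):", close(matmul(A(c, a), A(a, c)), UNIT))
print("(7):", abs(det(A(a, b))) > 0)
# (12): the intermediary point is the identity, A_(sigma)(c) A^(sigma)(b).
print("(12):", close(A(c, b), matmul(A(c, IDENTITY), A(IDENTITY, b))))
```

Because the assumed group law is linear in $b'$, the central differences are exact up to rounding, so the tolerance in `close` is generous.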


III. Lie groups. Adjoint groups 


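5. Lie groups. A Lie group is an $R_m$- or $L_m$-coordinate group for which 
$m \ge 2$. Hence in particular an $R_\infty$-coordinate group or an $R_A$-coordinate group 
is a Lie group. 

Consider the fundamental differential equations 

(1) $\dfrac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_{(\sigma)}(c) A^{(\sigma)}_\beta(b)$ 

of a Lie group with reference to coordinate neighborhoods of arbitrary elements 
$c$ and $b$ of the group. Since the $A^\alpha_{(\beta)}$ and $A^{(\alpha)}_\beta$ in the right members of (1) possess 
continuous first derivatives we can form the conditions of integrability of these 
equations. This gives 

(2) $\left[ \dfrac{\partial A^\alpha_{(\sigma)}(c)}{\partial c_\tau} A^\tau_{(\rho)}(c) - \dfrac{\partial A^\alpha_{(\rho)}(c)}{\partial c_\tau} A^\tau_{(\sigma)}(c) \right] A^{(\sigma)}_\beta(b) A^{(\rho)}_\gamma(b) + A^\alpha_{(\sigma)}(c) \left[ \dfrac{\partial A^{(\sigma)}_\beta(b)}{\partial b_\gamma} - \dfrac{\partial A^{(\sigma)}_\gamma(b)}{\partial b_\beta} \right] = 0$. 

Multiplication with $A^\beta_{(\mu)}(b) A^\gamma_{(\nu)}(b) A^{(\lambda)}_\alpha(c)$ changes (2) into an equivalent system, 
namely 

(2') $\left[ \dfrac{\partial A^\alpha_{(\mu)}(c)}{\partial c_\tau} A^\tau_{(\nu)}(c) - \dfrac{\partial A^\alpha_{(\nu)}(c)}{\partial c_\tau} A^\tau_{(\mu)}(c) \right] A^{(\lambda)}_\alpha(c) = \left[ \dfrac{\partial A^{(\lambda)}_\gamma(b)}{\partial b_\beta} - \dfrac{\partial A^{(\lambda)}_\beta(b)}{\partial b_\gamma} \right] A^\beta_{(\mu)}(b) A^\gamma_{(\nu)}(b)$. 

Since (2') hold identically for arbitrary elements $c$ and $b$ of the group it follows 
that 

(3) $\left( \dfrac{\partial A^{(\alpha)}_\tau}{\partial b_\sigma} - \dfrac{\partial A^{(\alpha)}_\sigma}{\partial b_\tau} \right) A^\sigma_{(\beta)} A^\tau_{(\gamma)} = C^{(\alpha)}_{\beta\gamma}$, 

(4) $\left( \dfrac{\partial A^\sigma_{(\beta)}}{\partial b_\tau} A^\tau_{(\gamma)} - \dfrac{\partial A^\sigma_{(\gamma)}}{\partial b_\tau} A^\tau_{(\beta)} \right) A^{(\alpha)}_\sigma = C^{(\alpha)}_{\beta\gamma}$, 

in which the $C$'s are constants called the constants of composition of the group. 
The relations (3) and (4) are equivalent on account of (11) in §4. It follows 
from (3) that the $C^{(\alpha)}_{\beta\gamma}$ satisfy the conditions 

(5) $C^{(\alpha)}_{\beta\gamma} = -C^{(\alpha)}_{\gamma\beta}$, 
(5') $C^{(\rho)}_{\alpha\beta} C^{(\sigma)}_{\rho\gamma} + C^{(\rho)}_{\beta\gamma} C^{(\sigma)}_{\rho\alpha} + C^{(\rho)}_{\gamma\alpha} C^{(\sigma)}_{\rho\beta} = 0$. 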
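
The constancy of the $C$'s and the conditions (5) and (5') can be illustrated on a concrete case. The sketch below is an assumption-laden example, not the paper's computation: it uses a hypothetical two-parameter group in which $(p, q)$ stands for the matrix with rows $(p, 0)$ and $(q, 1)$, for which differentiating the group law gives $A^\alpha_{(\beta)}(b)$ with rows $(b_1, 0)$, $(b_2, 1)$ and $A^{(\alpha)}_\beta(b)$ as its inverse. The constants are then formed as in (4), reading that relation as the bracket of the columns of $A^\alpha_{(\beta)}$ contracted with $A^{(\alpha)}_\sigma$.

```python
R = range(2)

def A_lower(b):
    """A^alpha_(beta)(b) for the hypothetical group (rows: alpha, columns: beta)."""
    return [[b[0], 0.0], [b[1], 1.0]]

def A_upper(b):
    """A^(alpha)_beta(b), the inverse matrix of A_lower(b)."""
    return [[1.0 / b[0], 0.0], [-b[1] / b[0], 1.0]]

def dA_lower(b, eps=1e-6):
    """d[al][be][s] = dA^al_(be) / db_s, by central differences."""
    d = [[[0.0, 0.0] for _ in R] for _ in R]
    for s in R:
        plus, minus = list(b), list(b)
        plus[s] += eps
        minus[s] -= eps
        Ap, Am = A_lower(plus), A_lower(minus)
        for al in R:
            for be in R:
                d[al][be][s] = (Ap[al][be] - Am[al][be]) / (2 * eps)
    return d

def constants(b):
    """C[rho][be][ga], formed at the point b as in relation (4)."""
    A, Au, dA = A_lower(b), A_upper(b), dA_lower(b)
    return [[[sum((dA[al][be][s] * A[s][ga] - dA[al][ga][s] * A[s][be]) * Au[rho][al]
                  for al in R for s in R)
              for ga in R] for be in R] for rho in R]

def max_diff(C1, C2):
    return max(abs(C1[r][b][g] - C2[r][b][g]) for r in R for b in R for g in R)

def skew_defect(C):
    """Largest violation of (5)."""
    return max(abs(C[r][b][g] + C[r][g][b]) for r in R for b in R for g in R)

def jacobi_defect(C):
    """Largest violation of (5')."""
    return max(abs(sum(C[r][a][b] * C[s][r][g] + C[r][b][g] * C[s][r][a]
                       + C[r][g][a] * C[s][r][b] for r in R))
               for a in R for b in R for g in R for s in R)

C1 = constants((2.0, 0.5))
C2 = constants((0.7, -3.0))
print(C1)
print(max_diff(C1, C2), skew_defect(C1), jacobi_defect(C1))
```

The same values are obtained at both sample points, in line with the statement that (2') forces these expressions to be constants.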


Now let $C^{(\alpha)}_{\beta\gamma}$ be any set of (real) constants satisfying (5) and (5'). It can 
then be shown that there exists a set of analytic functions $A^{(\alpha)}_\beta(\bar a)$ defined in a 
neighborhood $U(0)$ of the $r$-dimensional number space which satisfy the 
differential equations 

(3') $\dfrac{\partial A^{(\alpha)}_\gamma}{\partial \bar a_\beta} - \dfrac{\partial A^{(\alpha)}_\beta}{\partial \bar a_\gamma} = C^{(\alpha)}_{\sigma\tau} A^{(\sigma)}_\beta A^{(\tau)}_\gamma$. 

In fact such functions $A^{(\alpha)}_\beta$ are defined by the power series⁶ 

(6) $A^{(\alpha)}_\beta = \delta^\alpha_\beta + \dfrac{1}{2!} C^{(\alpha)}_{\sigma\beta} \bar a_\sigma + \dfrac{1}{3!} C^{(\alpha)}_{\sigma\rho} C^{(\rho)}_{\tau\beta} \bar a_\sigma \bar a_\tau + \cdots$ 

convergent for all values of $\bar a_\sigma$. Since $A^{(\alpha)}_\beta(0) = \delta^\alpha_\beta$ the determinant $| A^{(\alpha)}_\beta |$ does 
not vanish in a suitable neighborhood $U(0)$. In this neighborhood functions $A^\alpha_{(\beta)}$ 
can therefore be defined from the $A^{(\alpha)}_\beta$ by equations of the type (11) in §4. On 
account of the skew symmetry of the $C^{(\alpha)}_{\beta\gamma}$ in their lower indices it follows 
immediately from (6) that 

(7) $A^{(\alpha)}_\beta \bar a_\beta = \bar a_\alpha$. 

Now take the given constants $C^{(\alpha)}_{\beta\gamma}$ to be the constants of composition of the 
above Lie group and denote by $\bar A^{(\alpha)}_\beta(\bar a)$ the analytic functions which are thus 
determined in the neighborhood $U(0)$. Consider the equations 

(8) $\dfrac{\partial \bar a_\alpha}{\partial a_\beta} = \bar A^\alpha_{(\sigma)}(\bar a) A^{(\sigma)}_\beta(a)$ 

where $a$ lies in a neighborhood of the identity element $d$ of the Lie group. The 
conditions of integrability of (8) are satisfied on account of (3), (4) and the 
analogous equations in the functions $\bar A$, these latter equations being a 
consequence of (3'). Equations (8) with the initial conditions $\bar a = 0$ for $a = d$ define 
uniquely a regular coordinate transformation $\bar a(a)$ of a neighborhood $U(0)$. 
Writing (8) in the form 

(8') $\bar A^{(\alpha)}_\sigma(\bar a) \dfrac{\partial \bar a_\sigma}{\partial a_\beta} = A^{(\alpha)}_\beta(a)$ 

we see that the $\bar A^{(\alpha)}_\beta$ become the $A$'s with respect to the $\bar a$ coordinate system in 
$U(0)$; we observe that the conditions $(\partial \bar a_\alpha / \partial a_\beta)_0 = \delta^\alpha_\beta$ are satisfied.⁷ Hence, 
coordinates $\bar a$ can be introduced in a neighborhood of the identity element of a Lie group 
with respect to which the $A$'s are analytic functions. The above coordinates $\bar a$ are 
called canonical coordinates. 
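
The power series construction can be tried numerically. The sketch below assumes that the series (6) has the form $\sum_k M^k/(k+1)!$ with $M^\alpha_\beta = C^{(\alpha)}_{\sigma\beta} \bar a_\sigma$, and that (3') reads $\partial A^{(\alpha)}_\gamma/\partial \bar a_\beta - \partial A^{(\alpha)}_\beta/\partial \bar a_\gamma = C^{(\alpha)}_{\sigma\tau} A^{(\sigma)}_\beta A^{(\tau)}_\gamma$; both index placements are a reading of a damaged original and should be treated as assumptions. The constants chosen, those of the rotation group, are a hypothetical test case and satisfy (5) and (5').

```python
from math import factorial

N = 3

def eps3(i, j, k):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2

# Hypothetical constants of composition C[al][be][ga] (rotation group).
C = [[[eps3(al, be, ga) for ga in range(N)] for be in range(N)] for al in range(N)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def A_upper(x, terms=25):
    """Partial sum of sum_k M^k / (k+1)!, with M[al][be] = sum_s C[al][s][be] x[s]."""
    M = [[sum(C[al][s][be] * x[s] for s in range(N)) for be in range(N)]
         for al in range(N)]
    power = [[float(i == j) for j in range(N)] for i in range(N)]  # M^0
    total = [[0.0] * N for _ in range(N)]
    for k in range(terms):
        for i in range(N):
            for j in range(N):
                total[i][j] += power[i][j] / factorial(k + 1)
        power = matmul(power, M)
    return total

def defect_7(x):
    """Largest violation of (7): A x = x."""
    A = A_upper(x)
    return max(abs(sum(A[al][be] * x[be] for be in range(N)) - x[al])
               for al in range(N))

def defect_3prime(x, h=1e-5):
    """Largest violation of (3'), derivatives taken by central differences."""
    A0 = A_upper(x)
    dA = []  # dA[be][al][ga] = dA^al_ga / dx_be
    for be in range(N):
        xp, xm = list(x), list(x)
        xp[be] += h
        xm[be] -= h
        Ap, Am = A_upper(xp), A_upper(xm)
        dA.append([[(Ap[i][j] - Am[i][j]) / (2 * h) for j in range(N)]
                   for i in range(N)])
    worst = 0.0
    for al in range(N):
        for be in range(N):
            for ga in range(N):
                lhs = dA[be][al][ga] - dA[ga][al][be]
                rhs = sum(C[al][s][t] * A0[s][be] * A0[t][ga]
                          for s in range(N) for t in range(N))
                worst = max(worst, abs(lhs - rhs))
    return worst

x = (0.3, -0.2, 0.5)
print(defect_7(x), defect_3prime(x))
```

Relation (7) holds here for the reason given in the text: the skew symmetry of the constants makes $M^\alpha_\beta \bar a_\beta$ vanish, so only the leading term of the series acts on $\bar a$.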

We now suppose that the functions A are analytic functions of the coordinates 
a, in U(d) and define in the group a set of (equivalent) neighborhoods as the 
maps f-U(d). Introduce the coordinates of U(d) as the coordinates of corre- 
sponding points of the neighborhoods f.U(d); then, as observed in §4, the 





® See, L. P. neent, Continuous wide of Transformations, Princeton, 1933, p. 55. 
The above functions Af B ) are denoted by U% g by Eisenhart. 

7 ca follows from equations (8) of §4 that the transformation a — a changes the Af,) into 
the Ab, i in accordance with the equations 


7a o 7) a 0a, 
A(«)(@) = Ab )(a) ats ( ) 
0 


Oa, 





but since the derivatives (da,/84,)) have the values 5, the above equations (8’) result. 








THEORY OF LIE GROUPS 781 


coordinate transformations which are thus defined in the regions common to 
two such neighborhoods, are regular. 

Now consider two arbitrary points b and c of the Lie group with coordinate 
neighborhoods U(b) = b- U(d) and U(c) = c- U(d) respectively. Then b’ = b-a’ 
and c’ = c-a’’ define the points of U(b) and U(c) which correspond respectively 
to points a’ and a’’ of U(d). Suppose ¢c = h-b and consider the function 
c’ = h-b’ where his fixed. Forc’ C U(c) and b’ C U(b) we havec’ = h-b-a’ = 
c-a’’ where a’, a’’ © U(4); this gives a’’ = é-h-b-a’, where Z is the element 
inverse to c, and this equation defines the relation between the coordinates 
c, =a, andb, = a, of corresponding points in the neighborhoods U(c) and U(b). 


Hence

$$\frac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_{(\sigma)}(c)\,A^{(\sigma)}_\beta(b) \tag{9}$$

where the functions $A$ in the right members of these equations are analytic.
Taking $b = e$ in (9) we have

$$\frac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_{(\sigma)}(c)\,\delta^\sigma_\beta = A^\alpha_{(\beta)}(c)$$

in agreement with the definition of the quantities $A^\alpha_{(\beta)}$ in §4. Hence the values
of the functions $A$ at a point of any coordinate neighborhood $f\cdot U(e)$ are equal
to the values of the functions $A$ in $U(e)$ at the corresponding point of this latter
neighborhood. Now the composition function $c'(b')$ such that $c'(b) = c$ is a
solution of (9); also the system (9) admits one and only one solution satisfying
these initial conditions and this solution is analytic. Hence $c'(b')$ is an analytic
function of the coordinates of the neighborhood $U(b)$.

The transformation relations between the coordinates of two intersecting
coordinate neighborhoods $f\cdot U(e)$ are analytic. This follows as above since the
equations (9), in which the $A$'s are analytic functions, are satisfied by the
coordinates $c_\alpha$ and $b_\beta$ of two such neighborhoods.

We have now proved the following result: it is possible to introduce coordinates
into a Lie group with respect to which the group will become an $R_A$-coordinate group.


6. Properties of differentiability. Let $c$ and $b$ denote two arbitrary points
of an $R_A$-coordinate group. Put $c = h\cdot b$ and consider the function $c' = h\cdot b'$
for $h$ fixed. If the group is connected we can join $e$ to the point $b$ by an analytic
curve $\Gamma$ which we shall represent by the function $b'(t)$ for $0 \leq t \leq 1$, i.e.
$b'(0) = e$ and $b'(1) = b$. Then $c'(t) = h\cdot b'(t)$ will define an analytic curve $\Gamma'$
which will join the point $h$ to the point $c$. Now consider the equations

$$\frac{dc'_\alpha}{dt} = A^\alpha_{(\sigma)}(c')\,\lambda^\sigma(t) \tag{1}$$

which are satisfied along $\Gamma'$, i.e. by the function $c' = h\cdot b'(t)$, the
$A^\alpha_{(\sigma)}(c')$ and the $\lambda^\sigma(t)$ being respectively analytic functions of the
coordinates of $c'$ and of the parameter $t$ of the curve $\Gamma'$. Since the curve $\Gamma'$ is a
closed point set it can be covered by a finite number of coordinate neighborhoods
$N_1, \ldots, N_w$ where we may suppose $h \subset N_1$, $c \subset N_w$, and that in the
transition from $h$ to $c$ along $\Gamma'$ these neighborhoods are entered in the given order.
It follows by a well known theorem in differential equations that within the
neighborhood $N_1$ the solution $c'(t)$ of (1) is analytic in the initial values $h_\alpha$, i.e.
the coordinates of $h$ in $N_1$. Passing into the neighborhood $N_2$ the function $c'(t)$ is
likewise seen to be analytic in the $h_\alpha$ on account of the analyticity of the coordinate
transformation existing between $N_1$ and $N_2$. Continuing through the neighborhoods
$N_1, \ldots, N_w$ we obtain finally that $c'(1)$, i.e. the function $h\cdot b$, is analytic as a
function of the coordinates of the point $h$. Hence, a connected $R_A$-coordinate group is
likewise an $L_A$-coordinate group, i.e. it is an $A$-coordinate group (see §3).

As an immediate consequence of the above result we have that any connected
Lie group is both an $R_m$- and an $L_m$-coordinate group ($m \geq 2$), i.e. it is an
$m$-coordinate group. This is seen by making the regular coordinate transformations
(§5) by which the group is changed into an $A$-coordinate group and then
transforming back to the original coordinate neighborhoods of the group.

If the Lie group is not connected its component containing the identity element
$e$ is a connected Lie group and hence by the above result is an $m$-coordinate
group ($m \geq 2$).

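Now consider the function $c' = a'\cdot b$, where $b$ is fixed, of an $m$-coordinate
group ($m \geq 1$); we can apply to this function a treatment analogous to that
applied to the function $c' = f\cdot b'$ of §4. Thus we can define the quantities

$$\frac{\partial c_\alpha}{\partial a_\beta} = B^\alpha_\beta(c, a),$$

$$B^\alpha_{(\beta)}(c) = B^\alpha_\beta(c, e), \qquad B^{(\alpha)}_\beta(a) = B^\alpha_\beta(e, a),$$

such that $B^\alpha_\beta(c, a)$ are contravariant in the index $\alpha$ under transformations of the
coordinates $c_\alpha$ and covariant in the index $\beta$ under transformations of the
coordinates $a_\alpha$. The determinant $|B^\alpha_\beta(c, a)|$ does not vanish for any two points $c$ and
$a$ and hence the determinants $|B^\alpha_{(\beta)}(b)|$ and $|B^{(\alpha)}_\beta(b)|$ do not vanish for any
point $b$ of the group. We have $B^\alpha_\beta(b, b) = \delta^\alpha_\beta$, hence
$B^\alpha_{(\beta)}(e) = B^{(\alpha)}_\beta(e) = \delta^\alpha_\beta$, and

$$B^\alpha_{(\sigma)}(b)\,B^{(\sigma)}_\beta(b) = B^{(\alpha)}_\sigma(b)\,B^\sigma_{(\beta)}(b) = \delta^\alpha_\beta. \tag{2}$$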


Furthermore we have the relations

$$\frac{\partial c_\alpha}{\partial a_\beta} = B^\alpha_{(\sigma)}(c)\,B^{(\sigma)}_\beta(a) \tag{3}$$

as the second fundamental equations of the group (left differentiability). As
the conditions of integrability of these equations ($m \geq 2$) we obtain
















aB\”? aBy? , 
(4) ( abs o 7 ) BeBe, Cte.) 
aB?, P aBe . P 
(4’) ( 7 Bi.) — a Bis) BY = Ci?) 


where the C’s are constants satisfying conditions of the form (5) and (5’) of §5, 
these two sets of conditions being equivalent on account of (2). 
In an $m$-coordinate group ($m \geq 1$) the function $c = a\cdot b$ satisfies the system

$$\frac{\partial c_\alpha}{\partial b_\beta} = A^\alpha_{(\sigma)}(c)\,A^{(\sigma)}_\beta(b), \qquad \frac{\partial c_\alpha}{\partial a_\beta} = B^\alpha_{(\sigma)}(c)\,B^{(\sigma)}_\beta(a), \tag{5}$$

i.e. the first and second fundamental equations of the group. On account of
the existence and continuity of the derivatives in the left members of (5) it
follows by the use of the mean value theorem that the function $a\cdot b$ is
simultaneously continuous in the two sets of variables $a_\alpha$ and $b_\alpha$. Now by
differentiation of (5) for $m \geq 2$ we obtain


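$$\frac{\partial}{\partial a_\gamma}\left(\frac{\partial c_\alpha}{\partial b_\beta}\right) = \frac{\partial A^\alpha_{(\sigma)}(c)}{\partial c_\rho}\,B^\rho_{(\tau)}(c)\,B^{(\tau)}_\gamma(a)\,A^{(\sigma)}_\beta(b),$$

$$\frac{\partial}{\partial b_\beta}\left(\frac{\partial c_\alpha}{\partial a_\gamma}\right) = \frac{\partial B^\alpha_{(\tau)}(c)}{\partial c_\rho}\,A^\rho_{(\sigma)}(c)\,A^{(\sigma)}_\beta(b)\,B^{(\tau)}_\gamma(a).$$

The equality of the derivatives in the left members of these latter equations
results from their existence and continuity; hence

$$\frac{\partial A^\alpha_{(\sigma)}(c)}{\partial c_\rho}\,B^\rho_{(\tau)}(c) = \frac{\partial B^\alpha_{(\tau)}(c)}{\partial c_\rho}\,A^\rho_{(\sigma)}(c), \tag{6}$$

$$\frac{\partial A^{(\sigma)}_\beta}{\partial c_\gamma}\,A^\alpha_{(\sigma)} = \frac{\partial B^{(\sigma)}_\gamma}{\partial c_\beta}\,B^\alpha_{(\sigma)} \tag{6'}$$

as follows by a simple calculation. The equations (6) and (6') must be satisfied
identically; also since the first and second fundamental equations are completely
integrable it follows that the system (5) is completely integrable.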
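The two fundamental equations in (5) can be checked numerically on a concrete group. The sketch below is only an illustration under stated assumptions: the example group (affine maps $x \to px + q$ with coordinates $(p, q)$), the helper names, and the use of finite differences are not taken from the paper. It takes $A^\alpha_{(\beta)}(c)$ and $B^\alpha_{(\beta)}(c)$ to be the derivatives of $c\cdot x$ and $x\cdot c$ with respect to $x$ at $x = e$, and takes the matrices with the lower free index to be their inverses, as relations of the form (2) suggest.

```python
# Hypothetical example group (not from the paper): pairs (p, q), p > 0,
# standing for the affine maps x -> p*x + q of the real line.
E = (1.0, 0.0)  # coordinates of the identity element


def mul(a, b):
    """Composition a.b: apply b first, then a."""
    return (a[0] * b[0], a[0] * b[1] + a[1])


def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian J[i][j] = d f_i / d x_j at the point x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(tuple(xp)), f(tuple(xm))
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def close(X, Y, tol=1e-6):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))


def A_upper(c):
    """A^alpha_(beta)(c): derivative of c.x with respect to x at x = e."""
    return jacobian(lambda x: mul(c, x), E)


def B_upper(c):
    """B^alpha_(beta)(c): derivative of x.c with respect to x at x = e."""
    return jacobian(lambda x: mul(x, c), E)


a, b = (2.0, 0.5), (3.0, -1.0)
c = mul(a, b)

# Left members of (5): derivatives of c = a.b in b and in a.
dc_db = jacobian(lambda y: mul(a, y), b)
dc_da = jacobian(lambda y: mul(y, b), a)

# Right members of (5); the lower-index matrices are taken as inverses.
first = matmul(A_upper(c), inv2(A_upper(b)))
second = matmul(B_upper(c), inv2(B_upper(a)))

print(close(dc_db, first), close(dc_da, second))
```

Because the product of this example group is linear in each factor separately, the central differences are exact up to rounding, so the comparison does not depend delicately on the step size.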

Since the function $a\cdot b$ is a solution of the system (5) we can state the following
result. In an $m$-coordinate group ($m \geq 2$) the function $a\cdot b$ is (simultaneously)
continuous in the two sets of variables $a_\alpha$ and $b_\alpha$ and possesses continuous
derivatives in these variables to the $m$-th order; in particular if the group is an
$A$-coordinate group the function $a\cdot b$ is analytic in the variables $a_\alpha$ and $b_\alpha$.

Now from equations (4) and equations (3) of §5 we have 


€ (e) 
(2ae _ 2Ae) apy = Cf AAD AG 
ac, aC, € oT y € 














(or 


(e) (€) 


OC, dc, 













Hence from (5) we obtain 
(7) (9), AWAD AL) = Os), BOBO BE, 
and when taken at the point d these latter equations give 
a) a) 
(8) Cty) — —CtS)- 


Now let $a$ be an arbitrary element and $\bar a$ the inverse element so that
$\bar a\cdot a = e$. Then

$$\frac{\partial e_\alpha}{\partial \bar a_\beta} = B^\alpha_\beta(e, \bar a) = B^{(\alpha)}_\beta(\bar a)$$

and hence the functional determinant $|\partial e_\alpha/\partial \bar a_\beta|$ is different from zero. By the
implicit function theorem we can therefore solve the equations $\bar a\cdot a = e$ to
obtain the function $\bar a(a)$; since the function $\bar a\cdot a$ also possesses continuous
derivatives with respect to $a_\alpha$, the function $\bar a(a)$ is continuous and has continuous
first derivatives.

Differentiation of the equation $\bar a\cdot a = e$ with respect to $a_\alpha$ leads to the
equations

$$\frac{\partial \bar a_\alpha}{\partial a_\beta} = -B^\alpha_{(\sigma)}(\bar a)\,A^{(\sigma)}_\beta(a) \tag{9}$$

as the differential equations satisfied by the inverse function $\bar a$. On account
of (8) the equations (9) are completely integrable; use will be made of this
fact later.

In view of the system (9) we have the following result. In an $m$-coordinate
group, $m \geq 2$, the function $\bar a(a)$, defining the element inverse to the element $a$, is
continuous and has continuous derivatives to the $m$-th order. In particular if the group
is an $A$-coordinate group the inverse function $\bar a(a)$ will be analytic.


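7. The adjoint group. By taking the derivative of

$$(a\cdot b)\cdot c = a\cdot(b\cdot c) = \varepsilon \tag{1}$$

with respect to $b_\beta$ for an $m$-coordinate group ($m \geq 1$) and putting $a\cdot b = \mu$,
$b\cdot c = \nu$ we obtain

$$B^\alpha_{(\sigma)}(\varepsilon)\,B^{(\sigma)}_\rho(\mu)\,A^\rho_{(\tau)}(\mu)\,A^{(\tau)}_\beta(b) = A^\alpha_{(\sigma)}(\varepsilon)\,A^{(\sigma)}_\rho(\nu)\,B^\rho_{(\tau)}(\nu)\,B^{(\tau)}_\beta(b). \tag{2}$$

In these equations we set $c = \bar b$; then $\nu = e$ and $\varepsilon = a$ and (2) becomes

$$B^\alpha_{(\sigma)}(a)\,H^{(\sigma)}{}_{(\tau)}(\mu)\,A^{(\tau)}_\beta(b) = A^\alpha_{(\sigma)}(a)\,B^{(\sigma)}_\beta(b), \tag{3}$$

where

$$H^{(\alpha)}{}_{(\beta)}(\mu) = B^{(\alpha)}_\sigma(\mu)\,A^\sigma_{(\beta)}(\mu). \tag{4}$$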


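Multiplying (3) by $B^{(\gamma)}_\alpha(a)\,A^\beta_{(\delta)}(b)$ and renaming the free indices we have

$$H^{(\alpha)}{}_{(\beta)}(a\cdot b) = H^{(\alpha)}{}_{(\sigma)}(a)\,H^{(\sigma)}{}_{(\beta)}(b). \tag{5}$$

Putting $a = \bar b$ in (2) we obtain

$$H_{(\alpha)}{}^{(\beta)}(a\cdot b) = H_{(\alpha)}{}^{(\sigma)}(a)\,H_{(\sigma)}{}^{(\beta)}(b), \tag{6}$$

where

$$H_{(\alpha)}{}^{(\beta)}(\mu) = B^\sigma_{(\alpha)}(\mu)\,A^{(\beta)}_\sigma(\mu). \tag{7}$$

Between the matrices (4) and (7) we have the relations

$$H_{(\sigma)}{}^{(\alpha)}(a)\,H^{(\sigma)}{}_{(\beta)}(a) = \delta^\alpha_\beta. \tag{8}$$

The matrices $H$ defined by the equations (4) or (7) constitute a group which
by the correspondence $a \to H(a)$ is (multiply) isomorphic to the $m$-coordinate
group; either of the groups $H(a)$ is called the adjoint group of the given
$m$-coordinate group.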
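For the purpose of later application we shall give a second derivation of the
adjoint group. Consider the point transformation $x \to x'$ defined by the
equation

$$x' = (a\cdot x)\cdot \bar a \tag{9}$$

corresponding to an arbitrary but fixed element $a$. Differentiation of (9) with
respect to $x$ gives

$$\frac{\partial x'_\alpha}{\partial x_\beta} = B^\alpha_{(\sigma)}(x')\,B^{(\sigma)}_\rho(a\cdot x)\,A^\rho_{(\tau)}(a\cdot x)\,A^{(\tau)}_\beta(x). \tag{10}$$

If in (10) we put $x = e$, then $x' = e$ by (9) and we have

$$\left(\frac{\partial x'_\alpha}{\partial x_\beta}\right)_e = H^{(\alpha)}{}_{(\beta)}(a). \tag{11}$$

It follows immediately from (8) and (11) that

$$\left(\frac{\partial x_\alpha}{\partial x'_\beta}\right)_e = H_{(\beta)}{}^{(\alpha)}(a). \tag{12}$$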





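Now make a second transformation

$$x'' = (b\cdot x')\cdot \bar b = [(b\cdot a)\cdot x]\cdot(\overline{b\cdot a});$$

then

$$\left(\frac{\partial x''_\alpha}{\partial x_\beta}\right)_e = \left(\frac{\partial x''_\alpha}{\partial x'_\sigma}\right)_e \left(\frac{\partial x'_\sigma}{\partial x_\beta}\right)_e$$

and from this we deduce the relation (5).

In consequence of (11) we see that the quantities $H^{(\alpha)}{}_{(\beta)}(a)$ define the
transformation of contravariant vectors in $e$ induced by (9), i.e.

$$\bar\eta^{(\alpha)} = H^{(\alpha)}{}_{(\beta)}(a)\,\eta^{(\beta)}. \tag{13}$$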


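Similarly the $H_{(\alpha)}{}^{(\beta)}(a)$ define the covariant transformation in $e$:

$$\bar u_{(\alpha)} = H_{(\alpha)}{}^{(\beta)}(a)\,u_{(\beta)}. \tag{13'}$$

From (7) and (8) in §6 we deduce immediately

$$C^{(\alpha)}_{(\beta)(\gamma)} = C^{(\sigma)}_{(\mu)(\nu)}\,H^{(\alpha)}{}_{(\sigma)}\,H_{(\beta)}{}^{(\mu)}\,H_{(\gamma)}{}^{(\nu)} \tag{14}$$

i.e. the $C$ tensor is unchanged under transformations (9).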
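The second derivation of the adjoint group lends itself to a small numerical experiment. The following sketch rests on assumptions that are not in the paper: the example group (affine maps $x \to px + q$ with coordinates $(p, q)$), the function names, and the finite-difference Jacobian are all chosen for illustration. It computes $H(a)$ as the Jacobian of $x \to (a\cdot x)\cdot\bar a$ at the identity, in the manner of (11), and compares $H(a\cdot b)$ with $H(a)H(b)$ as in (5).

```python
# Hypothetical example group (not from the paper): pairs (p, q), p > 0,
# standing for the affine maps x -> p*x + q of the real line.
E = (1.0, 0.0)  # coordinates of the identity element


def mul(a, b):
    """Composition a.b: apply b first, then a."""
    return (a[0] * b[0], a[0] * b[1] + a[1])


def inv(a):
    """Inverse element: the affine map undoing x -> a0*x + a1."""
    return (1.0 / a[0], -a[1] / a[0])


def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian J[i][j] = d f_i / d x_j at the point x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(tuple(xp)), f(tuple(xm))
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(X, Y, tol=1e-6):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))


def H(a):
    """Jacobian at the identity of the transformation x -> (a.x).a_bar."""
    a_bar = inv(a)
    return jacobian(lambda x: mul(mul(a, x), a_bar), E)


a, b = (2.0, 0.5), (3.0, -1.0)
lhs = H(mul(a, b))
rhs = matmul(H(a), H(b))
print(close(lhs, rhs))
```

For this example the conjugation can also be worked out by hand: with $a = (p, q)$ and $x = (u, v)$ one finds $x' = (u,\ pv + q - uq)$, so $H(a)$ has rows $(1, 0)$ and $(-q, p)$, which is what the finite differences reproduce.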


IV. Representations. Covering groups 


8. The differential equations of the representations. Let $a$ denote a (square)
matrix group, the elements of the matrices of $a$ being continuous functions of
the points $a$ of an $m$-coordinate group $\mathfrak{A}$. If $a$ is isomorphic to $\mathfrak{A}$, i.e. if to any
element $a \subset \mathfrak{A}$ there corresponds a unique element $a(a) \subset a$ such that

$$a(a)\,a(b) = a(a\cdot b) \tag{1}$$

and if furthermore $a(e) = 1$, where $1$ denotes the unit matrix, the matrix group
$a$ is called a representation of the group $\mathfrak{A}$.

If the elements of the matrices $a(a)$ possess continuous first derivatives
with respect to the coordinates $a_\alpha$ of points $a \subset \mathfrak{A}$ we shall speak of $a$ as a
differentiable representation of $\mathfrak{A}$.⁸ In this case we can differentiate (1) with
respect to $b_\beta$ and then put $b = e$ so as to obtain


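$$\frac{\partial a(a)}{\partial a_\alpha} = a(a)\,\eta_{(\sigma)}\,A^{(\sigma)}_\alpha(a), \qquad \eta_{(\sigma)} = \left(\frac{\partial a}{\partial a_\sigma}\right)_e. \tag{2}$$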











⁸ It is sufficient to require that the elements of $a(a)$ have first derivatives in the
neighborhood of some point $b$ and have continuous derivatives at this point. Consider

$$a(a)\,a(b) = a(c), \qquad c = a\cdot b$$

in which the point $a$ is fixed so that $b = \bar a\cdot c$ is a function of $c$. Let one of the
coordinates $c_\alpha$ change by amount $\Delta_\alpha$. Then denoting by $\Delta_i$ the corresponding change
of the coordinate $b_i$ we have

$$a(a)\sum_{i=1}^{r}\frac{a(b_1, \ldots, b_{i-1}, b_i + \Delta_i, b_{i+1} + \Delta_{i+1}, \ldots) - a(b_1, \ldots, b_i, b_{i+1} + \Delta_{i+1}, \ldots)}{\Delta_i}\,\frac{\Delta_i}{\Delta_\alpha} = \frac{a(c_1, \ldots, c_\alpha + \Delta_\alpha, \ldots, c_r) - a(c_1, \ldots, c_r)}{\Delta_\alpha}.$$

By the mean value theorem the left member becomes

$$a(a)\sum_{i=1}^{r}\frac{\partial a(b_1, \ldots, b_{i-1}, b_i + \theta_i\Delta_i, b_{i+1} + \Delta_{i+1}, \ldots)}{\partial b_i}\,\frac{\Delta_i}{\Delta_\alpha}$$

where of course $\theta_i$ depends on the element of the matrix $a$. Letting $\Delta_\alpha \to 0$ we now have

$$\frac{\partial a(c)}{\partial c_\alpha} = a(a)\,\frac{\partial a(b)}{\partial b_i}\,\frac{\partial b_i}{\partial c_\alpha}$$

i.e. the existence and continuity of the derivatives in the left members of these equations
is established owing to the existence and continuity of the derivatives in the right members.







It follows immediately from these equations that the matrices $a(a)$ have
continuous derivatives to the order $m$ and are in fact analytic if $\mathfrak{A}$ is an
$A$-coordinate group. The constant matrices $\eta_{(\sigma)}$ characterize the representation $a$ in
the sense that $a(a)$ is uniquely determined by (2) and the initial conditions
$a(e) = 1$. We shall call the matrices $\eta_{(\sigma)}$ the characteristic matrices of the
representation $a$.
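Differentiating $a(a)a(b) = a(a\cdot b)$ with respect to $b_\beta$ at $b = e$ gives $\partial a(a)/\partial a_\alpha = a(a)\,\eta_{(\sigma)}A^{(\sigma)}_\alpha(a)$, where $\eta_{(\sigma)}$ is the derivative of the representation at the identity. The sketch below checks this relation numerically. Everything concrete in it is an assumption made for illustration and not part of the paper: the example group (affine maps $x \to px + q$), its two-dimensional matrix representation `rep`, and the finite-difference helpers.

```python
# Hypothetical example (not from the paper): the group of affine maps
# x -> p*x + q with coordinates (p, q), represented by 2x2 matrices.
E = (1.0, 0.0)  # coordinates of the identity element


def mul(a, b):
    """Composition a.b: apply b first, then a."""
    return (a[0] * b[0], a[0] * b[1] + a[1])


def rep(a):
    """Matrix representation: rep(a) rep(b) = rep(a.b), rep(e) = unit matrix."""
    return [[a[0], a[1]], [0.0, 1.0]]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def close(X, Y, tol=1e-6):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))


def d_rep(a, alpha, h=1e-6):
    """Central-difference derivative of rep with respect to coordinate alpha."""
    ap, am = list(a), list(a)
    ap[alpha] += h
    am[alpha] -= h
    P, M = rep(tuple(ap)), rep(tuple(am))
    return [[(P[i][j] - M[i][j]) / (2 * h) for j in range(2)] for i in range(2)]


def A_upper(c, h=1e-6):
    """A^alpha_(beta)(c): derivative of c.x with respect to x at x = e."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        xp, xm = list(E), list(E)
        xp[j] += h
        xm[j] -= h
        fp, fm = mul(c, tuple(xp)), mul(c, tuple(xm))
        for i in range(2):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


a = (2.0, 0.5)
eta = [d_rep(E, s) for s in range(2)]   # characteristic matrices
A_lower = inv2(A_upper(a))              # A^(sigma)_alpha(a), taken as the inverse


def right_member(alpha):
    """rep(a) * eta_(sigma) * A^(sigma)_alpha(a), summed over sigma."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(2):
        term = matmul(rep(a), eta[s])
        for i in range(2):
            for j in range(2):
                total[i][j] += term[i][j] * A_lower[s][alpha]
    return total


checks = [close(d_rep(a, alpha), right_member(alpha)) for alpha in range(2)]
print(checks)
```

In this example the characteristic matrices are simply the two elementary matrices with a single 1 in the first row, and $A^{(\sigma)}_\alpha(a) = \delta^\sigma_\alpha/p$.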

By multiplying (2) on the left with an arbitrary constant matrix $m$ we see
that $m\,a(a)$ is a solution of (2) having the initial values $m$. Hence the existence
of the above solution $a(a)$ such that $a(e) = 1$ has as a consequence the complete
integrability of the system (2). Calculation of the conditions of integrability
of (2) gives
(3) Nyy Mey — Nyy My + My) Cl2,) = 0. 


The adjoint group of an $m$-coordinate group ($m \geq 1$) is a special representation
and for $m \geq 2$ this representation is differentiable. Owing to the importance
of this latter case we shall give the explicit derivation of the differential
equations of the adjoint group. From the conditions (6) of §6 we have














aB?,) : oa 
Hence by differentiation of the equations 
(4) Hiss) = Bi, Av” 
we obtain 
0H,’ aA”? 0A’? 
—_—_—- = — ce, AP AW B’ o 
ac, ac. (Aye + Bip) ac, 
aA” aA” € v Qa € sd v a 
” | i ose Bi) ae Cte AS A® Bé,) = HS) Cte) AS 3 or 
0H‘? — 
5) HS 8 HP Cley A”, 


corresponding to the equations (2); thus the characteristic matrices 
Ta) _ | Crs) || 


for the adjoint group defined by (4) and the integrability conditions (3) are 
identical with the conditions (5’) of §5. 

The order of the matrices $a(a)$ of a representation will be called the
dimensionality of the representation. In the case of a one dimensional representation
$\Delta(a)$ the conditions (3) reduce to

$$\eta_{(\sigma)}\,C^{(\sigma)}_{(\alpha)(\beta)} = 0 \tag{3'}$$

and hence a necessary condition for the existence of a one dimensional
representation of $\mathfrak{A}$ is that (3') admit a non-vanishing solution $\eta_{(\sigma)}$. The (continuous)
function $\Delta(a)$ defining a one dimensional representation has no extremum in
$\mathfrak{A}$ or $\Delta(a) = 1$. If $a$ is an extremal point, then

$$\Delta(a) \geq \Delta(a'), \qquad (\text{or } \Delta(a) \leq \Delta(a')) \tag{6}$$

for all points $a'$ of a certain neighborhood $U(a)$. By multiplying (6) by $\Delta(b)$
we have

$$\Delta(b\cdot a) \geq \Delta(b\cdot a'), \qquad (\text{or } \Delta(b\cdot a) \leq \Delta(b\cdot a'))$$

in the neighborhood $b\cdot U(a)$ of the point $b\cdot a$. But $b\cdot a$ is any point of $\mathfrak{A}$ and
hence all points of $\mathfrak{A}$ are extremal points. Hence $\Delta(a) = \Delta(e) = 1$ for any
point $a \subset \mathfrak{A}$ ($\mathfrak{A}$ connected).

If $a(a)$ is a representation of $\mathfrak{A}$ then the determinant $\Delta(a)$ of the matrix $a(a)$
defines a one dimensional representation of $\mathfrak{A}$. Assuming that the representation
$a(a)$ is differentiable we find from (2) that

$$\frac{\partial \Delta}{\partial a_\alpha} = \Delta\,\eta^{\kappa}_{(\sigma)\kappa}\,A^{(\sigma)}_\alpha(a). \tag{7}$$

In a compact space the continuous function $\Delta(a)$ has an extremum so that by
the above result $\Delta(a) = 1$. Then it follows from (7) that

$$\eta^{\kappa}_{(\sigma)\kappa} = 0 \tag{8}$$

is a necessary condition for $a(a)$ to be a representation of a compact group
space.
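The contrast between the compact and the non-compact case can be seen on two small matrix groups. The sketch below is illustrative only; both example representations (rotations of the plane for a compact group, the matrices of the affine maps $x \to px + q$ for a non-compact one) and all function names are assumptions, not taken from the paper.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def affine_rep(p, q):
    """Hypothetical non-compact example: matrix of the map x -> p*x + q."""
    return [[p, q], [0.0, 1.0]]


def rotation(theta):
    """Hypothetical compact example: rotation of the plane by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


X, Y = affine_rep(2.0, 0.5), affine_rep(3.0, -1.0)
R, S = rotation(0.7), rotation(-2.1)

# The determinant is multiplicative, hence a one dimensional representation.
det_affine_product = det2(matmul(X, Y))
det_rotation_product = det2(matmul(R, S))

print(det_affine_product, det2(X) * det2(Y))
print(det_rotation_product)
```

On the rotation group the determinant is identically 1, in line with the statement for compact group spaces, while on the affine example it equals $p$ and is not constant.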


9. The group germ. By a group germ will be meant a neighborhood $\mathfrak{N}$ of
the identity element $e$ of a topological group such that the inverse of any
element of $\mathfrak{N}$ is contained in $\mathfrak{N}$; in the case of a coordinate group it will be
understood that the group germ is a coordinate neighborhood. Owing to the
continuity of the inverse function $\bar a(a)$ of an $m$-coordinate group ($m \geq 2$) it is
evident that a group germ will always exist for such a group; a group germ will
likewise exist for a Lie group (see §6).

Now consider the system (2) of §8 for a Lie group, the conditions of
integrability (3) of §8 being satisfied by the characteristic matrices. Let $U(e)$ be any
simply connected coordinate neighborhood of the identity element $e$. Then (2)
admits a unique solution $a(a)$ defined throughout $U(e)$ such that $a(e) = 1$.
Let $\mathfrak{N} \subset U(e)$ be a group germ such that the product $\mathfrak{N}\cdot\mathfrak{N}$ is contained in $U(e)$.
Then the solution $a(a)$ satisfies (1) of §8 for $a, b \subset \mathfrak{N}$; this follows since each
member of (1) is a solution of (2) with the same initial value $a(a)$.

The above solution $a(a)$ satisfying (1) of §8 for $a, b \subset \mathfrak{N}$ will be called a
representation of the group germ $\mathfrak{N}$. We have proved the result: In any Lie group
there exists a group germ $\mathfrak{N}$ such that any solution $a(a)$ of the completely integrable
system (2) of §8 for which $a(e) = 1$ is a representation of the germ $\mathfrak{N}$.










10. The covering group. Consider a connected Lie group $\mathfrak{A}$ and a group
germ $\mathfrak{N}$. We shall now construct the covering group $\mathfrak{A}_1$ of $\mathfrak{A}$ with respect to
$\mathfrak{N}$ in the following manner.⁹

Denote by $A$ any ordered set of a finite number of elements of $\mathfrak{N}$, i.e.

$$A = (a_1 a_2 \cdots a_t), \qquad a_1, a_2, \ldots, a_t \subset \mathfrak{N}. \tag{1}$$

Let $\mathfrak{A}_1$ be the set of all sets $A$.¹⁰ If

$$B = (b_1 b_2 \cdots b_s), \qquad b_1, b_2, \ldots, b_s \subset \mathfrak{N} \tag{2}$$

is any other element of $\mathfrak{A}_1$, the product $A\cdot B$ is defined by

$$A\cdot B = (a_1 \cdots a_t b_1 \cdots b_s); \tag{3}$$

hence

$$A\cdot B \subset \mathfrak{A}_1 \quad \text{if} \quad A, B \subset \mathfrak{A}_1. \tag{4}$$

In consequence of (3) we have the relation

$$(A\cdot B)\cdot C = A\cdot(B\cdot C) \tag{5}$$

between any three elements $A$, $B$, $C$ of $\mathfrak{A}_1$.

If $a, b \subset \mathfrak{N}$ and if $a\cdot b \subset \mathfrak{N}$ the product $a\cdot b$ will be called a simple product.
Hence if $a\cdot b$ is a simple product we have

$$a\cdot b = c \quad \text{where} \quad a, b, c \subset \mathfrak{N}. \tag{6}$$

Then between the corresponding elements $A = (a)$, $B = (b)$ and $C = (c)$ of $\mathfrak{A}_1$
we define the relation

$$A\cdot B = C. \tag{6'}$$

The simple product relation (6') is the only relation assumed to exist between
the elements of $\mathfrak{A}_1$. Hence the elements (1) and (2) of $\mathfrak{A}_1$ are equal if, and
only if, it is possible to pass from the set $(a_1 \cdots a_t)$ to the set $(b_1 \cdots b_s)$ by the
process of changing brackets and the use of simple product relations (6'). For
example the elements $(abcde)$ and $(uvwx)$ are equal if the following indicated
operations can be performed:

$$(abcde) = (ab)(c)(de) = (u)(c)(de) = (u)(vw)(de) = (u)(vw)(x) = (uvwx).$$





We now complete the proof that $\mathfrak{A}_1$ is a group. By definition the product of
two elements of $\mathfrak{A}_1$ belongs to $\mathfrak{A}_1$ and also the associative law (5) is satisfied.





⁹ This idea was first used by O. Schreier to construct the covering group of a given
group in his paper, Abstrakte kontinuierliche Gruppen, Abh. Mathem. Seminar d. Hamburg
Univ., 1926, p. 21.

¹⁰ We distinguish between the element $a$ of $\mathfrak{N}$ and the ordered set $(a)$ which by
definition is an element of $\mathfrak{A}_1$.










It remains to show the existence of a unit element $E \subset \mathfrak{A}_1$ and that any
element of $\mathfrak{A}_1$ has an inverse belonging to $\mathfrak{A}_1$. Obviously $E = (e)$. In fact from
(1) and (3) we have

$$A\cdot E = (a_1 \cdots a_t e) = (a_1 \cdots a_{t-1})(a_t)(e) = C\cdot(A_t\cdot E)$$

where

$$C = (a_1 \cdots a_{t-1}), \qquad A_t = (a_t), \qquad E = (e).$$

But $a_t\cdot e = a_t$ is a simple product so that $A_t\cdot E = A_t$. Hence

$$A\cdot E = C\cdot A_t = A. \tag{7}$$

In a similar manner it is proved that

$$E\cdot A = A. \tag{7'}$$

Consider the element (1). Since the germ $\mathfrak{N}$ contains the element inverse to
each of its elements there exists in $\mathfrak{A}_1$ the element

$$\bar A = (\bar a_t \cdots \bar a_1). \tag{8}$$

We can now show that

$$A\cdot\bar A = E, \qquad \bar A\cdot A = E. \tag{9}$$

The proof of (9) is immediate for $t = 1$; as the proof is the same for any value
of $t \geq 2$ let us take for definiteness the case $t = 3$. We then have

$$A\cdot\bar A = (a_1 a_2)(a_3 \bar a_3)(\bar a_2 \bar a_1).$$

Since $a_3\cdot\bar a_3 = e$ is a simple product it follows that $(a_3 \bar a_3) = A_3\cdot\bar A_3 = E$. Hence

$$A\cdot\bar A = (a_1 a_2)(\bar a_2 \bar a_1) = (a_1)(a_2 \bar a_2)(\bar a_1) = (a_1 \bar a_1) = E.$$

The second equation (9) follows in a similar manner. Hence $\mathfrak{A}_1$ is an abstract
group.
By the correspondence

$$A = (a_1 a_2 \cdots a_t) \to a_1\cdot a_2\cdot \cdots \cdot a_t \tag{10}$$

we coordinate to any element $A \subset \mathfrak{A}_1$ just one element of $\mathfrak{A}$. Indeed any
change in $(a_1 \cdots a_t)$ which leaves $A$ unchanged (bracket changes and use of
simple product relations) leaves likewise unchanged the corresponding element
of $\mathfrak{A}$. Now if $\mathfrak{A}$ is connected it can be shown that any element of $\mathfrak{A}$ is given as
the product of a finite number of elements of $\mathfrak{N}$ (see O. Schreier, loc. cit.).
Assuming that $\mathfrak{A}$ is connected any one of its elements is therefore the
correspondent of at least one element of $\mathfrak{A}_1$. Hence, the correspondence (10) defines a
(multiple) isomorphism between the groups $\mathfrak{A}_1$ and $\mathfrak{A}$.

Denote by $\mathfrak{N}_1$ the subset of elements of $\mathfrak{A}_1$ which can be written as sets of a
single element of $\mathfrak{N}$, i.e. $\mathfrak{N}_1$ consists of all elements (1) for which $t = 1$. Between
the elements $(a)$ of $\mathfrak{N}_1$ and the corresponding elements $a$ of $\mathfrak{N}$ there is a (1, 1)
correspondence. In fact to $A = (a)$ there corresponds only the element $a$.
Consider an element $B = (b)$ such that $A \neq B$. Then $a \neq b$ and hence if $a$
corresponds also to $B$ then to $B$ would correspond two different elements of $\mathfrak{N}$,
namely $a$ and $b$, which gives a contradiction.
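The construction can be made concrete on the simplest example, the circle group of real numbers modulo 1. The sketch below is an illustration under assumptions that are not in the paper: the germ is taken to be the set of classes with a representative $t$, $|t| < 1/4$, each germ element is stored by that representative, and all names are hypothetical. Words of germ elements play the role of the ordered sets (1); since a simple product relation replaces two adjacent steps by their sum, the real-valued total of a word is unchanged by bracket changes and simple products, while the total modulo 1 is the image under the correspondence (10).

```python
from fractions import Fraction

# Assumed germ of the circle group R/Z: classes with a representative t,
# |t| < 1/4. Each germ element is stored as that representative.
EPS = Fraction(1, 4)


def in_germ(t):
    return abs(t) < EPS


def word(*steps):
    """An ordered set (a_1 a_2 ... a_t) of germ elements."""
    steps = tuple(Fraction(s) for s in steps)
    assert all(in_germ(s) for s in steps)
    return steps


def multiply(A, B):
    """Product of two words: juxtaposition, as in equation (3)."""
    return A + B


def inverse(A):
    """Inverse word: inverses of the entries in reversed order, as in (8)."""
    return tuple(-s for s in reversed(A))


def merge_pass(A):
    """One left-to-right pass of simple product relations (not a normal form)."""
    out = []
    for s in A:
        if out and in_germ(out[-1] + s):
            out[-1] = out[-1] + s  # a.b = c with a, b, c in the germ
        else:
            out.append(s)
    return tuple(out)


def total(A):
    """Real-valued sum of a word; unchanged by simple product relations."""
    return sum(A, Fraction(0))


def project(A):
    """Image in the circle group under the correspondence (10)."""
    return total(A) % 1


loop = word(*[Fraction(1, 5)] * 5)       # five steps of 1/5 around the circle
trivial = multiply(loop, inverse(loop))  # the word A . A_bar

print(project(loop), total(loop), total(trivial))
```

In this example the word `loop` corresponds to the identity of the circle group although its total is 1 rather than 0, so it is not equal to the unit word; this is the kind of element collected in the subgroup $D$ of §11. That the total separates all classes of words is a property of this particular example and is not claimed by the sketch in general.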

In consequence of the (1, 1) isomorphism existing between $\mathfrak{N}_1$ and the germ
$\mathfrak{N}$ we can define as the neighborhoods $U(A)$ of an element $A = (a) \subset \mathfrak{N}_1$ those
point sets corresponding to neighborhoods $U(a) \subset \mathfrak{N}$ by this isomorphism. On
the basis of the neighborhoods $U(A)$ so defined $\mathfrak{N}_1$ is a topological space.

Now let $B$ denote any element of $\mathfrak{A}_1$ and define the neighborhoods $U(B)$ by

$$U(B) = B\cdot U(E), \tag{11}$$

where $U(E)$ is any neighborhood of $E$ as above defined. We shall now show
that the neighborhoods $U(B)$ so defined in $\mathfrak{A}_1$ are in fact neighborhoods in the
sense of Hausdorff.

The first Hausdorff axiom is evidently satisfied since $B \subset U(B)$. The second
axiom can be proved in exactly the same manner as in §2.

If, now, $U_1(B) = B\cdot U_1(E)$ and $U_2(B) = B\cdot U_2(E)$ are two neighborhoods
of $B$ defined by means of the two neighborhoods $U_1(E)$ and $U_2(E)$, then there
exists a neighborhood $U_3(E)$ such that

$$U_3(E) \subset U_1(E) \cap U_2(E).$$

Multiplication on the left by $B$ gives

$$B\cdot U_3(E) = U_3(B) \subset U_1(B) \cap U_2(B),$$

i.e. the third Hausdorff axiom is satisfied.

Letting $B$ and $C$ be two different points of $\mathfrak{A}_1$ we now have to show that
neighborhoods $U(B)$ and $U(C)$ exist which have no point in common. If this
can be proved for the point $E$ and any other point $A \subset \mathfrak{A}_1$ it will then follow
for any two different points $B, C \subset \mathfrak{A}_1$ by multiplication of the neighborhoods
$U(E)$ and $U(A)$ by the element $B$.

Consider a neighborhood $U(E) \subset \mathfrak{N}_1$. If $U(E) \supset A$ we can find neighborhoods
$\bar U(E)$ and $\bar U(A)$ without a common element since $\mathfrak{N}_1$ is a topological space.

If $A$ is not contained in the above neighborhood $U(E)$ we proceed as follows.
Choose a neighborhood $U_1(E) \subset U(E)$ such that the product $A\cdot B \subset U(E)$ if
$A, B \subset U_1(E)$. This can be done on account of the (1, 1) isomorphism between
$\mathfrak{N}_1$ and the germ $\mathfrak{N}$ for which the analogous condition can be satisfied. For the
same reason we can choose a neighborhood $U_2(E) \subset U_1(E)$ with the properties:

$$A\cdot B \subset U_1(E) \quad \text{if} \quad A, B \subset U_2(E),$$
$$\bar A \subset U_1(E) \quad \text{if} \quad A \subset U_2(E),$$

where $\bar A$ is the element inverse to $A$. We shall now show that the neighborhood
$A\cdot U_2(E)$ of $A$ has no point in common with $U_1(E)$. If this is not the case
there is a point $B' \subset A\cdot U_2(E)$ and $B' \subset U_1(E)$. Hence there is a point $B \subset$













$U_2(E)$ such that $A\cdot B = B'$. By multiplication with $\bar B$ we then have $A = B'\cdot\bar B$.
But

$$B' \subset U_1(E) \quad \text{and} \quad \bar B \subset U_1(E).$$

Hence $B'\cdot\bar B \subset U(E)$ so that $A \subset U(E)$ in contradiction with the above
hypothesis. As this proves the fourth Hausdorff axiom it follows that $\mathfrak{A}_1$ is a
topological group with respect to the neighborhoods defined by (11) in $\mathfrak{A}_1$.

Now consider two arbitrary elements $A, B \subset \mathfrak{A}_1$ such that $A = F\cdot B$ and let
$A'(B')$ denote the function defined by

$$A' = F\cdot B', \qquad (F \text{ fixed}).$$

It follows directly from the definition of neighborhood in $\mathfrak{A}_1$ that $A'(B')$ is
continuous (1) for $B = E$ and $A$ arbitrary and (2) for $A = E$ and $B$ arbitrary.
Hence it follows that $A'(B')$ is continuous for both $A$ and $B$ arbitrary. The
group $\mathfrak{A}_1$ is therefore an $R$-continuous topological group.

Let us define a coordinate system in $\mathfrak{N}_1$, taking as the coordinates of any
point $A \subset \mathfrak{N}_1$ the coordinates of the corresponding point $a \subset \mathfrak{N}$. Introduce, as
in §5, the coordinates of $\mathfrak{N}_1$ as the coordinates of corresponding points of the
neighborhoods $F\cdot U(E)$. We shall show that the transformation relations
between the coordinates of two such coordinate neighborhoods $U(A)$ and $U(B)$
are regular. Let

$$U(A) = A\cdot U_1(E), \qquad U(B) = B\cdot U_2(E)$$

be two coordinate neighborhoods, the points $A$ and $B$ being any two points of
$\mathfrak{A}_1$. Consider the function $B'(A')$ defined by

$$B' = K\cdot A', \qquad (K \text{ fixed}), \tag{12}$$

such that $B = K\cdot A$. Then (12) implies $B' = B\cdot P$ and $A' = A\cdot Q$ where
$P, Q \subset \mathfrak{N}_1$, the coordinates of $P$ and $Q$ being the same as the coordinates of $B'$
and $A'$ respectively. Hence $B\cdot P = K\cdot A\cdot Q$ so that

$$P = G\cdot Q, \qquad (G = \bar B\cdot K\cdot A). \tag{13}$$

Now $P$ and $Q$ are points of $\mathfrak{N}_1$ but $G$ is not necessarily a point of $\mathfrak{N}_1$. We can
however select an intermediary point $D \subset \mathfrak{N}_1$ and write the transformation (13)
as

$$P = H\cdot D, \qquad D = J\cdot Q. \tag{13'}$$

The point $D$ can be taken so that $H, J \subset \mathfrak{N}_1$. In fact if $D = E$, we have
$P = H \subset \mathfrak{N}_1$. Also $Q = \bar J \subset \mathfrak{N}_1$, and hence $J = \bar Q \subset \mathfrak{N}_1$ since $\mathfrak{N}_1$ contains the
inverse of each of its elements. Then the first of the equations (13') defines $P$ as a
regular function of $D \subset \mathfrak{N}_1$ for $H$ fixed and the second of the equations (13') defines
$D$ as a regular function of $Q \subset \mathfrak{N}_1$ for $J$ fixed. Hence the coordinate
transformation defined by (13) is regular ($m \geq 2$).

In particular if $A' = B'$ the function $B'(A')$ defines a regular coordinate
transformation between the coordinates of $U(A)$ and $U(B)$ in the intersection
of these neighborhoods.

We have now shown that the covering group $\mathfrak{A}_1$ is a Lie group on the basis of the
above definition of coordinate neighborhoods.


11. Relations between a Lie group and its covering group. Suppose the
group germ $\mathfrak{N}$ of a Lie group $\mathfrak{A}$ is connected and denote by

$$A = (a_1 a_2 \cdots a_k), \qquad a_1, a_2, \ldots, a_k \subset \mathfrak{N}$$

an arbitrary point of the covering group $\mathfrak{A}_1$. By assumption there exists in $\mathfrak{N}$
a continuous curve $a_s(t)$, $0 \leq t \leq 1$, connecting $e$ and $a_s$ ($s = 1, \ldots, k$). Hence
$(a_s(t)) = A_s(t)$ connects $E$ and $A_s$ in $\mathfrak{N}_1$. In the sequence of points of $\mathfrak{A}_1$:

$$(a_1),\ (a_1 a_2),\ \ldots,\ (a_1 \cdots a_j),\ (a_1 \cdots a_j a_{j+1}),\ \ldots,\ (a_1 \cdots a_k)$$

the first is connected to $E$. Assume that $(a_1 \cdots a_j)$ is connected to $E$. Then
$(a_1 \cdots a_j)\cdot(a_{j+1}(t))$ defines a continuous curve in $\mathfrak{A}_1$ on account of the property
of right hand continuity in $\mathfrak{A}_1$ and this curve connects $(a_1 \cdots a_j)$ to $(a_1 \cdots a_{j+1})$
which is therefore connected to $E$ by the above hypothesis. Hence, if the group
germ $\mathfrak{N}$ of a Lie group is connected the covering group $\mathfrak{A}_1$ with respect to this germ
is likewise connected.

In the following it will be assumed that the covering group $\mathfrak{A}_1$ is derived from
a connected group germ. Since the group $\mathfrak{A}_1$ is then a connected Lie group its
composition function $A\cdot B$ possesses the property of simultaneous continuity and
differentiability with respect to the elements $A$ and $B$; likewise the inverse
function $\bar A(A)$ is regular (see §6).

Let $D$ denote the invariant subgroup of $\mathfrak{A}_1$ which is composed of the elements
corresponding to $e$ by the isomorphism existing between $\mathfrak{A}_1$ and $\mathfrak{A}$. Then $D$ is
a discrete point set in $\mathfrak{A}_1$. The proof is as follows. If $A$ is an accumulation
point of points of $D$ then a sequence $A_n \to A$ of points $A_n \subset D$ would exist
such that $A_n \neq A_m$ for $m \neq n$. Put $C_n = A_n\cdot\bar A_{n-1}$. Then

$$C_n \subset D, \qquad C_n \neq E, \qquad C_n \to E.$$

But this is impossible since in $\mathfrak{N}_1$ no point of $D$ besides $E$ exists owing to the
simple isomorphism between the sets $\mathfrak{N}_1$ and $\mathfrak{N}$. Hence the Lie group $\mathfrak{A}$ is simply
isomorphic to a factor group $\mathfrak{A}_1/D$ of $\mathfrak{A}_1$ where $D$ is a discrete invariant subgroup
of $\mathfrak{A}_1$.

We shall now prove that $D$ is a subgroup of the centrum of $\mathfrak{A}_1$. This means
that

$$B\cdot A = A\cdot B, \qquad A \subset D, \quad B \subset \mathfrak{A}_1. \tag{1}$$

Let $B(t)$, $0 \leq t \leq 1$, be a curve in $\mathfrak{A}_1$ connecting $E$ and $B$; then

$$\bar B(t)\cdot A\cdot B(t) = A'(t), \qquad A'(t) \subset D$$

for any element $A$ of $D$ since $D$ is an invariant subgroup of $\mathfrak{A}_1$. But the
continuous curve $A'(t)$ connects $A$ and $A'(1)$. Since $D$ is a discrete group it
follows that $A = A'(1)$ and hence (1) is proved.

Now consider a representation $a(a)$ of the group germ $\mathfrak{N}$ used for the
construction of the covering group $\mathfrak{A}_1$. We can extend this representation $a(a)$
defined for points of $\mathfrak{N}$ to a matrix group $\mathfrak{M}$. An element of $\mathfrak{M}$ is defined as the
product of a finite set of matrices

$$a(a_1)\,a(a_2) \cdots a(a_t), \qquad a_1, a_2, \ldots, a_t \subset \mathfrak{N}. \tag{2}$$

The set of elements (2) obviously constitutes a group.

We shall now investigate the relation between $\mathfrak{A}_1$ and $\mathfrak{M}$. Let the above
matrix (2) correspond to the point $A = (a_1 a_2 \cdots a_t)$ of $\mathfrak{A}_1$. Then to the point $A$
corresponds only one matrix (2) since any simple product relation $a\cdot b = c$
implies the corresponding relation $a(a)\,a(b) = a(c)$ in the matrix group $\mathfrak{M}$. Hence
the correspondence

$$(a_1 a_2 \cdots a_t) \to a(a_1)\,a(a_2) \cdots a(a_t)$$

defines a (multiple) isomorphism between $\mathfrak{A}_1$ and $\mathfrak{M}$. The matrix group $\mathfrak{M}$
determined from the group germ $\mathfrak{N}$ of $\mathfrak{A}$ is therefore a representation of the covering
group $\mathfrak{A}_1$ of $\mathfrak{A}$ with respect to the germ $\mathfrak{N}$.


12. The universal covering group. Two Lie groups will be said to be in
the same class $\mathfrak{C}$ if they have the same $C$ tensor, i.e. the tensor having as its
components the constants of composition of the group. Let $L_1$ and $L_2$ be two
Lie groups in the same class $\mathfrak{C}$ and introduce canonical coordinates in these
groups. In $L_1$ the canonical coordinates define a neighborhood $N_1$ of the origin
of the $r$-dimensional number space and similarly in $L_2$ they define a
neighborhood $N_2$. The intersection $N = N_1 \cap N_2$ determines a neighborhood of the
identity element of $L_1$ and $L_2$ in which the quantities $A^{\alpha}_{(\beta)}$ of the two groups are
the same functions of the canonical coordinates. Hence, there exists a
neighborhood $N^* \subset N$ such that the composition function $a \cdot b$ for $a, b \subset N^*$ is the
same for the two groups $L_1$ and $L_2$; this follows from the fact that in each
case the composition function is given as the solution of the (same) completely
integrable system, namely the system (12) of §4.¹¹ Going out from this result
it has been shown by O. Schreier (loc. cit.) that there exists a uniquely
determined Lie group $\tilde{\mathfrak{A}}$ belonging to the class $\mathfrak{C}$ which is a covering group for each
group of this class. The group $\tilde{\mathfrak{A}}$ is characterized by the fact that it is simply
connected and called the universal covering group of the class $\mathfrak{C}$.

It follows from the result previously established that any group $\mathfrak{A}$ of the
class $\mathfrak{C}$ is simply isomorphic to a factor group $\tilde{\mathfrak{A}}/D$ where $D$ is a discrete
invariant subgroup belonging to the centrum of $\tilde{\mathfrak{A}}$.
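
As an illustration that is not part of the original text, the statement can be read off on two standard examples. The notation $\tilde{\mathfrak{A}}$ for the universal covering group and the choice of examples are assumptions made here for the sketch; the facts used (that $SU(2)$ is simply connected with centrum $\{1,-1\}$, and that the real line covers the circle) are classical.

```latex
% Class of the rotation group in three dimensions (r = 3):
\[
\tilde{\mathfrak{A}} = SU(2), \qquad
D = \{\,1,\,-1\,\} = \text{centrum of } SU(2), \qquad
SO(3) \simeq SU(2)/D .
\]
% The only discrete subgroups of the centrum are {1} and {1,-1},
% so the class contains, up to simple isomorphism, only SU(2) and SO(3).

% Abelian class with r = 1 (all constants of composition zero):
\[
\tilde{\mathfrak{A}} = \mathbb{R}, \qquad
D = \mathbb{Z}, \qquad
\mathbb{R}/\mathbb{Z} \simeq \text{the circle group}.
\]
```

In both cases the factor group has the same constants of composition as its covering group, which is all that membership in the class requires.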





¹¹ See W. Mayer and T. Y. Thomas, Existenzbeweis für Systeme totaler
Differentialgleichungen, to be published elsewhere.










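Now let $\mathfrak{A} = \tilde{\mathfrak{A}}/D$ be any group of the class $\mathfrak{C}$ and $H$ the adjoint group of $\mathfrak{A}$.
Consider the decomposition of $\tilde{\mathfrak{A}}$ into its co-sets defined by $D$, namely


$\tilde{\mathfrak{A}} = D + D \cdot \tilde{a} + D \cdot \tilde{b} + \cdots$


and let $\tilde{a} \to a \subset \mathfrak{A}$, $\tilde{b} \to b \subset \mathfrak{A}$ by the isomorphism between $\tilde{\mathfrak{A}}/D$ and $\mathfrak{A}$. We can
now define a differentiable representation $H_1$ of $\tilde{\mathfrak{A}}$ by the following identification


$H_1(D) = 1$, $H_1(D \cdot \tilde{a}) = H(a)$, $H_1(D \cdot \tilde{b}) = H(b)$, $\cdots$


Since $H_1$ is a differentiable representation of $\tilde{\mathfrak{A}}$ it satisfies a system of equations
of the form

$\dfrac{\partial H_1(a)}{\partial a_\alpha} = H_1(a)\,\eta_{(\beta)}\,A^{(\beta)}_{\alpha}(a)$, $a \subset \tilde{\mathfrak{A}}$.

But the elements of the characteristic matrices $\eta_{(\beta)}$ in these equations are
identical with a set of components of the $C$ tensor defining the class in consequence
of the above definition of the representation $H_1$. Hence the representation
$H_1$ is identical with the adjoint group $H$ of the universal covering group $\tilde{\mathfrak{A}}$.
Conversely it is immediately seen that if $D$ is any discrete invariant subgroup of $\tilde{\mathfrak{A}}$,
then $\tilde{\mathfrak{A}}/D$ is a member of the class $\mathfrak{C}$. This gives the following result: if $\tilde{\mathfrak{A}}$ is the
universal covering group of a class $\mathfrak{C}$ and $H$ its adjoint group then any group $\mathfrak{A}$ of
$\mathfrak{C}$ is simply isomorphic to a factor group $\tilde{\mathfrak{A}}/D$ where $D$ is a discrete invariant
subgroup of $\tilde{\mathfrak{A}}$ such that $H(D) = 1$ and conversely any such factor group $\tilde{\mathfrak{A}}/D$ is a
member of the class $\mathfrak{C}$.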


13. Locally true representations. A representation $a(a)$ of a Lie group
$\mathfrak{A}$ will be said to be locally true if there is a group germ of $\mathfrak{A}$ the elements
of which are in (1, 1) correspondence with the matrices of the representation.
If the representation $a(a)$ is a locally true representation of a Lie group $\mathfrak{A}$ it will
be a locally true representation of any group of the class $\mathfrak{C}$ of $\mathfrak{A}$. In this sense
we can speak of a locally true representation of a class $\mathfrak{C}$.

A necessary and sufficient condition for $a(a)$ to be a locally true representation
of a class $\mathfrak{C}$ is that the $r$ matrices $\eta_{(\alpha)}$ which characterize the representation be linearly
independent, i.e. that


(1) $\eta_{(\alpha)}\,\xi^{\alpha} = 0$


has $\xi^{\alpha} = 0$ as its only solution.
Let $\xi^{\alpha} = e^{\alpha}$ be a solution of (1) such that not all the $e^{\alpha}$ are equal to zero.
Define a curve $b_\alpha(t)$ through the identical point $\mathring{a}$ by the equations


$\dfrac{db_\alpha}{dt} = A^{\alpha}_{(\beta)}(b)\,e^{\beta}$.

From the equations (2) of §8 we then have $da/dt = 0$ along this curve; hence
$a = \text{const.}$ for all points of the curve. Hence the representation $a$ can not be













locally true since points of the above curve lie in each neighborhood of the 
identity element. 

Now suppose that the representation $a$ is not locally true. Then in any
neighborhood of $\mathring{a}$ there are different points $a$ and $b$ such that $a(a) = a(b)$ and
hence there exists a sequence of points $c_n \to \mathring{a}$, $c_n \neq \mathring{a}$, for which $a(c_n) = 1$. Let
the point $\mathring{a}$ and the point $c_n$ be connected by the curve


$a_\alpha(t) = \mathring{a}_\alpha + \xi_n^{\alpha}\,t$,

where the $\xi_n^{\alpha}$ are constants normalized so that


$\sum_\alpha \xi_n^{\alpha}\,\xi_n^{\alpha} = 1$.


Let the parameter $t$ for the point $c_n$ be denoted by $T_n$. Then the matrix
$a(\mathring{a} + \xi_n t) = 1$ for $t = 0$ and $t = T_n$, and $T_n \to 0$ as $n \to \infty$. By the mean
value theorem we have


(2) $\left[\dfrac{d\,a(\mathring{a} + \xi_n t)}{dt}\right]_{t = \theta T_n} = 0$, $0 < \theta < 1$,


where the value of $\theta$ may of course be different for different elements of the
matrix $a$. Let $\xi$ denote an accumulation direction of the directions $\xi_n$. Then
there exists a subset of directions $\xi_n$, which we denote by $\xi_{n'}$, such that $\xi_{n'} \to \xi$ as
$n' \to \infty$. For the points $c_{n'}$ corresponding to the subset of directions $\xi_{n'}$ the
equations (2) become

$\left(\dfrac{\partial a}{\partial a_\alpha}\right)_{\mathring{a}} \xi^{\alpha} = 0$, or $\eta_{(\alpha)}\,\xi^{\alpha} = 0$

as $n' \to \infty$. Hence the above statement is proved.
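
The criterion is a finite linear-algebra test, so it can be sketched in a few lines. The following is a minimal illustration, not taken from the paper: the helper names `rank` and `linearly_independent` and both sample sets of matrices are assumptions chosen for the example. The matrices are flattened into vectors and their rank is computed exactly with rational arithmetic.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def linearly_independent(matrices):
    """True if eta_(alpha) xi^alpha = 0 has only the solution xi = 0."""
    flat = [[x for row in eta for x in row] for eta in matrices]
    return rank(flat) == len(matrices)


# Hypothetical example 1: the three infinitesimal rotations in 3 dimensions.
SO3 = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

# Hypothetical example 2: a two-parameter Abelian group represented by
# a(a) = 1 + (a_1 + a_2) N, so that both characteristic matrices equal N.
N = [[0, 1], [0, 0]]
DEGENERATE = [N, N]
```

With these inputs `linearly_independent(SO3)` is true, while `linearly_independent(DEGENERATE)` is false: the direction $\xi = (1, -1)$ solves (1), and the points $(t, -t)$ all map to the unit matrix.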

The considerations of §11 and §12 enable us now to state the following result:
if $\mathfrak{M}$ is a locally true representation of a Lie group $\mathfrak{A}$ and if $D$ is the set of elements
of $\mathfrak{A}$ corresponding to the unit matrix of $\mathfrak{M}$ then $D$ is a discrete invariant subgroup
of $\mathfrak{A}$ contained in the centrum of $\mathfrak{A}$.


14. Reducible representations. Let $a(a)$ be a differentiable representation
of a Lie group $\mathfrak{A}$ and denote by $a^i_k(a)$, where $i, k = 1, \ldots, n$, the matrix
elements of the representation. Any matrix $a(a)$ determines a transformation


$x'^i = a^i_k(a)\,x^k$


of the $n$-dimensional vector space. The representation $a(a)$ is said to be
reducible if there exists a proper vector space invariant under all transformations
of the representation. If $M^k_p$, where $k = 1, \ldots, n$; $p = 1, \ldots, m$ and $0 < m < n$,
is a basis of the above invariant vector space $\mathfrak{M}$ of dimension $m$ then by the above
definition we have


(1) $a(a)\,M = M\,b(a)$, or
(1') $a^i_k(a)\,M^k_p = M^i_q\,b^q_p(a)$







as the necessary and sufficient condition for the reducibility of the
representation. The rank of the matrix $M$ will be $m$ because of the independence of the
vectors of the basis. Hence the matrices $b(a)$ are uniquely determined by (1') and
are easily seen to constitute a differentiable representation of the group $\mathfrak{A}$.
From (1) we obtain by differentiation


(2) $\eta_{(\gamma)}\,M = M\,m_{(\gamma)}$, $\gamma = 1, \ldots, r$,


where $\eta_{(\gamma)}$ and $m_{(\gamma)}$ are the characteristic matrices of the representations $a$ and
$b$ respectively. Hence the vector transformations determined by the
characteristic matrices $\eta_{(\gamma)}$ of $a$ also leave invariant the vector space $\mathfrak{M}$.

Conversely suppose that (2) holds where the $\eta_{(\gamma)}$ are the characteristic matrices
of a representation $a(a)$ of the connected Lie group $\mathfrak{A}$. We shall prove that
(1) is then a consequence of (2). The matrices $m_{(\gamma)}$ are uniquely determined as
solutions of (2) and are seen to satisfy the conditions (3) of §8. Hence there
exists a representation $b(a)$ of a group germ $\mathfrak{N}$ of $\mathfrak{A}$ having the $m_{(\gamma)}$ as its
characteristic matrices. Now consider


$c(a) = a(a)\,M - M\,b(a)$

defined in $\mathfrak{N}$. By differentiation we obtain


$\dfrac{\partial c}{\partial a_\beta} = a\,\eta_{(\gamma)}\,M\,A^{(\gamma)}_{\beta} - M\,b\,m_{(\gamma)}\,A^{(\gamma)}_{\beta}$

$\qquad = a\,M\,m_{(\gamma)}\,A^{(\gamma)}_{\beta} - M\,b\,m_{(\gamma)}\,A^{(\gamma)}_{\beta} = c\,m_{(\gamma)}\,A^{(\gamma)}_{\beta}$.


Now from $c(\mathring{a}) = 0$ it follows that $c(a) = 0$ for $a \subset \mathfrak{N}$, i.e. (1) holds in $\mathfrak{N}$. From
this fact and the fact that any element of $\mathfrak{A}$ can be expressed as the product of
a finite number of elements of $\mathfrak{N}$ it is easily seen that (1) holds throughout $\mathfrak{A}$.
Hence, a necessary and sufficient condition for the reducibility of a representation
$a(a)$ of a connected Lie group $\mathfrak{A}$ is that the characteristic matrices $\eta_{(\gamma)}$ be reducible.
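
Condition (2) says that each characteristic matrix maps the column space of $M$ into itself, which can be tested by comparing ranks. The sketch below is an illustration under that reading only; the function names and the sample matrices are assumptions, not notation from the paper.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def leaves_invariant(etas, M):
    """True if eta M = M m is solvable for m, for every eta in etas.

    M is an n x m matrix whose columns are the basis vectors.
    """
    basis = [list(col) for col in zip(*M)]
    base_rank = rank(basis)
    for eta in etas:
        image = [list(col) for col in zip(*matmul(eta, M))]
        if rank(basis + image) != base_rank:
            return False
    return True


# Hypothetical characteristic matrices: both are upper triangular.
ETAS = [[[1, 1], [0, 2]], [[0, 1], [0, 0]]]
FIRST_AXIS = [[1], [0]]   # spanned by (1, 0)
SECOND_AXIS = [[0], [1]]  # spanned by (0, 1)
```

Here `leaves_invariant(ETAS, FIRST_AXIS)` holds, so these matrices are reducible, whereas `leaves_invariant(ETAS, SECOND_AXIS)` fails because the first matrix sends $(0, 1)$ to $(1, 2)$.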


15. Integral theory of representations for compact groups. The integral
theory of representations is based on the possibility of introducing an (invariant)
volume integral in the Lie group. In each point of the group one can define a
set of $r$ independent covariant vectors $e_{(\sigma)\alpha}(a)$ by the equations


(1) $e_{(\sigma)\alpha}(a) = A^{(\beta)}_{\alpha}(a)\,e_{(\sigma)\beta}$, $|\,e_{(\sigma)\beta}\,| \neq 0$,

where the $e$'s in the right members are the components of a set of $r$ independent
covariant vectors in the identical point. Now consider the integral


(2) $I(V) = \displaystyle\int_V |\,e_{(\sigma)\alpha}(a)\,|\;da_1 \cdots da_r$,


the integration being extended over the integrable point set $V$ with the absolute
value of the determinant $|\,e_{(\sigma)\alpha}(a)\,|$ in the integrand. If $f$ is any point of the
group we shall show that


(3) $I(f \cdot V) = I(V)$,



















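where $f \cdot V$ denotes the point set $\{f \cdot v\}$, $v \subset V$. Consider the point
transformation


(4) $a' = f \cdot a$, ($f$ fixed);

then

$\dfrac{\partial a'_\alpha}{\partial a_\beta} = A^{\alpha}_{(\gamma)}(a')\,A^{(\gamma)}_{\beta}(a)$.

From this it follows that

(5) $e_{(\sigma)\beta}(a) = \dfrac{\partial a'_\alpha}{\partial a_\beta}\,e_{(\sigma)\alpha}(a')$;

hence

(5') $|\,e_{(\sigma)\beta}(a)\,| = \left|\dfrac{\partial a'_\alpha}{\partial a_\beta}\right|\,|\,e_{(\sigma)\alpha}(a')\,|$.


In consequence of the transformation (4) the integral $I(V)$ becomes


$I(V) = \displaystyle\int_{f \cdot V} |\,e_{(\sigma)\alpha}(a)\,|\,\left|\dfrac{\partial a_\beta}{\partial a'_\gamma}\right|\;da'_1 \cdots da'_r$.


Then by (5') we obtain (3).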

If we replace the $A^{(\beta)}_{\alpha}(a)$ by $B^{(\beta)}_{\alpha}(a)$ in (1) the right member of (2) will be an
integral which we shall denote by $J(V)$. In the same way as (3) was proved
we can now prove that


(6) $J(V \cdot f) = J(V)$.

Now consider the adjoint group defined by

$H^{\alpha}_{\beta}(a) = B^{(\alpha)}_{\gamma}(a)\,A^{\gamma}_{(\beta)}(a)$.


If the determinant $|\,H^{\alpha}_{\beta}\,| = 1$ then the determinants $|\,A^{(\alpha)}_{\beta}\,|$ and $|\,B^{(\alpha)}_{\beta}\,|$ are
equal and hence $I(V) = J(V)$. The above condition $|\,H^{\alpha}_{\beta}\,| = 1$ is satisfied in
any compact group (§8).
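
That the two integrals can really differ when the group is not compact may be checked on a small example. The sketch below is an illustration only: the group of maps $x \to a_1 x + a_2$ with $a_1 > 0$ is a choice made here, and the densities are computed by hand from $A^{\alpha}_{(\beta)}(a) = [\partial\varphi_\alpha(a,b)/\partial b_\beta]_{b=\mathring{a}}$ and $B^{\alpha}_{(\beta)}(b) = [\partial\varphi_\alpha(a,b)/\partial a_\beta]_{a=\mathring{a}}$. The code verifies relation (5') and its right-handed analogue pointwise with exact rationals.

```python
from fractions import Fraction


def compose(a, b):
    """Product a.b in the group of maps x -> a1*x + a2 with a1 > 0."""
    return (a[0] * b[0], a[0] * b[1] + a[1])


def left_density(a):
    """Determinant of the inverse of A = a1 * (unit matrix): integrand of I(V)."""
    return 1 / a[0] ** 2


def right_density(a):
    """Determinant of the inverse of B = [[a1, 0], [a2, 1]]: integrand of J(V)."""
    return 1 / a[0]


def left_jacobian(f):
    """Determinant of d(f.a)/da, which equals f1**2 for every a."""
    return f[0] ** 2


def right_jacobian(f):
    """Determinant of d(a.f)/da, which equals f1 for every a."""
    return f[0]


a = (Fraction(3, 2), Fraction(-1, 4))
f = (Fraction(5, 7), Fraction(2))

# Relation (5'): density(a) = |Jacobian| * density(transformed point).
left_ok = left_density(a) == left_jacobian(f) * left_density(compose(f, a))
right_ok = right_density(a) == right_jacobian(f) * right_density(compose(a, f))
densities_differ = left_density(a) != right_density(a)
```

Both invariance checks succeed, yet the two densities $1/a_1^2$ and $1/a_1$ are different, so for this non-compact group $I(V)$ and $J(V)$ do not coincide.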

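Now let $\mathfrak{A}$ be a compact Lie group and let $f(a)$ be an integrable function
defined in $\mathfrak{A}$. We shall then prove that


(7) $\displaystyle\int_{\mathfrak{A}} f(b \cdot a)\,|\,e_{(\sigma)\alpha}(a)\,|\;da_1 \cdots da_r = \int_{\mathfrak{A}} f(a)\,|\,e_{(\sigma)\alpha}(a)\,|\;da_1 \cdots da_r$.

In fact the transformation $a' = b \cdot a$ ($b$ fixed) carries the left member of (7) into


$\displaystyle\int_{\mathfrak{A}} f(a')\,|\,e_{(\sigma)\alpha}(a)\,|\,\left|\dfrac{\partial a_\beta}{\partial a'_\gamma}\right|\;da'_1 \cdots da'_r = \int_{\mathfrak{A}} f(a')\,|\,e_{(\sigma)\alpha}(a')\,|\;da'_1 \cdots da'_r$.


If $a(a)$ is a representation of the group $\mathfrak{A}$ and if we put $f^i = a^i_1$ then


(8) $f^i(a \cdot b) = a^i_k(a)\,f^k(b)$, $(i, k = 1, \ldots, n)$.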










Conversely suppose that we have a set of $n$ functions $f^i$ which are linearly
independent and which have the property (8); then the quantities $a^i_k(a)$ in (8)
define a representation of the group $\mathfrak{A}$. This follows immediately from


$f^i((a \cdot b) \cdot c) = f^i(a \cdot (b \cdot c))$


and the condition of linear independence. To obtain such a set of functions
$f^i(a)$ let us construct the linear integral equation


(9) $\varphi(x) = \lambda \displaystyle\int_{\mathfrak{A}} K(x, y)\,\varphi(y)\,\omega(y)\;dy_1 \cdots dy_r$, $\omega(y) = |\,e_{(\sigma)\alpha}(y)\,|$


with the symmetric function $K(x, y)$ such that¹²

(10) $K(a \cdot x, a \cdot y) = K(x, y)$.


From the theory of integral equations we know that (9) has characteristic
values $\lambda$ and solutions $\varphi$. Let $\lambda$ be a characteristic value of multiplicity $n$; then
there exists a basis $\varphi_1(x), \ldots, \varphi_n(x)$ of solutions of (9) corresponding to the
above value of $\lambda$. We shall show that $\varphi_1(a \cdot x), \ldots, \varphi_n(a \cdot x)$ is also a basis.
In fact


$\varphi_p(a \cdot x) = \lambda \displaystyle\int K(a \cdot x, y)\,\varphi_p(y)\,\omega(y)\;dy_1 \cdots dy_r$

$\qquad = \lambda \displaystyle\int K(a \cdot x, a \cdot y)\,\varphi_p(a \cdot y)\,\omega(y)\;dy_1 \cdots dy_r$

$\qquad = \lambda \displaystyle\int K(x, y)\,\varphi_p(a \cdot y)\,\omega(y)\;dy_1 \cdots dy_r$


by (7) and (10). Hence

$\varphi_p(a \cdot x) = a^q_p(a)\,\varphi_q(x)$


and from the preceding result $a^q_p(a)$ defines a representation of the group $\mathfrak{A}$.
For further developments of this theory reference may be made to the paper
by F. Peter and H. Weyl, Math. Annalen, Bd. 97.


V. Fundamental theorems of Lie 


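16. Construction of the group germ from the differential equations.
Consider a completely integrable system


(1) $\dfrac{\partial \varphi_\alpha}{\partial b_\beta} = A^{\alpha}_{(\gamma)}(\varphi)\,A^{(\gamma)}_{\beta}(b)$, $\alpha, \beta, \gamma = 1, \ldots, r$;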





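¹² The above equation (9) can be written in the form


$\varphi(x)\sqrt{\omega(x)} = \lambda \displaystyle\int K(x, y)\,\sqrt{\omega(x)}\,\sqrt{\omega(y)}\;\varphi(y)\,\sqrt{\omega(y)}\;dy_1 \cdots dy_r$


in which the kernel $K(x, y)\sqrt{\omega(x)}\sqrt{\omega(y)}$ is symmetric.
The general solution of $K(x, y) = K(y, x)$ and of (10) is: $K(x, y) = f(\bar{y} \cdot x) + f(\bar{x} \cdot y)$.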
















in which the A’s are analytic in a domain D of the r-dimensional number space. 
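It is assumed that the matrices $\|A^{\alpha}_{(\beta)}\|$ and $\|A^{(\alpha)}_{\beta}\|$ are inverse so that the
equations


(2) $A^{\alpha}_{(\gamma)}(a)\,A^{(\gamma)}_{\beta}(a) = A^{(\alpha)}_{\gamma}(a)\,A^{\gamma}_{(\beta)}(a) = \delta^{\alpha}_{\beta}$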


are satisfied in D, the conditions of integrability (3) and (4) of §4 being likewise 
satisfied in this domain. By integration of (1) we shall derive the group germ 
and we shall show how this can be extended to an A-coordinate group in the 
following section. On account of the result established in §5 the above hypoth- 
esis of analyticity is no less general than the assumption that the A’s are con- 
tinuous and possess continuous first partial derivatives in the domain D. 

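Any point $\mathring{a} \subset D$ can be made the identity element of the A-coordinate group
which we shall construct. For the arbitrary point $\mathring{a}$ the equations (10) of §4,
namely


(3) $A^{\alpha}_{(\beta)}(\mathring{a}) = A^{(\alpha)}_{\beta}(\mathring{a}) = \delta^{\alpha}_{\beta}$,


will not be satisfied. It is possible however to introduce new quantities $\bar{A}$ in (1)
which will leave unchanged the functional relationships imposed by these
equations and which will moreover be such that (3) holds. Put


$\bar{A}^{\alpha}_{(\beta)}(a) = A^{\alpha}_{(\gamma)}(a)\,A^{(\gamma)}_{\beta}(\mathring{a})$,
$\bar{A}^{(\alpha)}_{\beta}(a) = A^{\alpha}_{(\gamma)}(\mathring{a})\,A^{(\gamma)}_{\beta}(a)$.


The matrices $\|\bar{A}^{\alpha}_{(\beta)}\|$ and $\|\bar{A}^{(\alpha)}_{\beta}\|$ are inverse and the conditions (3) are satisfied
by the $\bar{A}$'s. Also it follows from (1) and (2) that


$\dfrac{\partial \varphi_\alpha}{\partial b_\beta} = \bar{A}^{\alpha}_{(\gamma)}(\varphi)\,\bar{A}^{(\gamma)}_{\beta}(b)$.


Hence we can suppose without loss of generality that the conditions (3) are
satisfied by the functions $A$ in (1).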

In the following we shall mean by a neighborhood $U(\mathring{a})$ of the point $\mathring{a} \subset D$
any open point set $S$ such that $\mathring{a} \subset S \subset D$. From the theory of completely
integrable systems of differential equations we know that there exists a unique
solution $\varphi_\alpha(a, b)$ of (1) defined by convergent power series in $a$ and $b$ in a
neighborhood $U(\mathring{a})$, i.e. for $a, b \subset U(\mathring{a})$, and such that¹³


(4) $\varphi_\alpha(a, \mathring{a}) = a_\alpha$.

Writing these equations in the symbolic form

(4') $a \cdot \mathring{a} = a$, $a \subset U(\mathring{a})$,


we bring into evidence the character of the element $\mathring{a}$ as a right handed unit
element. In general we shall employ the symbolic product $a \cdot b$ to represent
the above solution $\varphi_\alpha(a, b)$.





¹³ See W. Mayer and T. Y. Thomas, loc. cit.




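Given any neighborhood $U_1(\mathring{a})$ there exists a second neighborhood $U_2(\mathring{a})$ such
that

$a \cdot b \subset U_1(\mathring{a})$ for $a, b \subset U_2(\mathring{a})$.


This follows from the simultaneous continuity in $a$ and $b$ of the above solution
$\varphi_\alpha(a, b)$ of (1).

Corresponding to the relation (4') we now show that

(5) $\mathring{a} \cdot b = b$, $b \subset U(\mathring{a})$, or
(5') $\varphi_\alpha(\mathring{a}, b) = b_\alpha$,

the left members of these latter equations denoting the solution of (1) which
becomes $\mathring{a}_\alpha$ for $b = \mathring{a}$. But this property is possessed by the function $\varphi_\alpha = b_\alpha$
which is a solution of (1), and since the above conditions determine the
solution of (1) uniquely it follows that (5') holds and hence we have (5) as the
symbolic representation of (5').

In a similar manner we can show that

(6) $(a \cdot c) \cdot b = a \cdot (c \cdot b)$,

where $a, b, c \subset \bar{U}(\mathring{a}) \subset U(\mathring{a})$ and $\bar{U}(\mathring{a})$ is such that the product of any three
points of $\bar{U}(\mathring{a})$ exists. Writing (6) in the form

(6') $\varphi_\alpha(\varphi(a, c), b) = \varphi_\alpha(a, \varphi(c, b))$,

the left members of these equations represent, by definition, that solution of (1)
which for $b = \mathring{a}$ assumes the initial values $\varphi_\alpha(a, c)$. But


$\dfrac{\partial \varphi_\alpha(a, \varphi(c, b))}{\partial b_\beta} = \dfrac{\partial \varphi_\alpha(a, \varphi)}{\partial \varphi_\gamma}\,\dfrac{\partial \varphi_\gamma(c, b)}{\partial b_\beta}$

$\qquad = A^{\alpha}_{(\delta)}(\varphi(a, \varphi))\,A^{(\delta)}_{\gamma}(\varphi)\,A^{\gamma}_{(\epsilon)}(\varphi)\,A^{(\epsilon)}_{\beta}(b)$

$\qquad = A^{\alpha}_{(\delta)}(\varphi(a, \varphi(c, b)))\,A^{(\delta)}_{\beta}(b)$.


Hence the right members of (6') represent a solution of (1) and in fact a solution
such that


$[\varphi_\alpha(a, \varphi(c, b))]_{b = \mathring{a}} = \varphi_\alpha(a, c)$.


Since the solutions of (1) given by the left and right members of (6') assume the
same initial values they must be identical.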
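
The relation between the composition function and the system (1) can be checked on a concrete group. The following sketch is an illustration under stated assumptions: the group of maps $x \to a_1 x + a_2$ is chosen here as the example, the names `compose`, `d_phi_db` and `A_upper` are hypothetical, and $A^{\alpha}_{(\beta)}(a)$ is taken to be the derivative of $\varphi_\alpha(a, b)$ with respect to $b_\beta$ at the identity, as follows from (1), (3) and (4). Because this particular $\varphi$ is linear in $b$, central differences with rational numbers are exact.

```python
from fractions import Fraction

IDENTITY = (Fraction(1), Fraction(0))
STEP = Fraction(1, 1000)


def compose(a, b):
    """phi(a, b) for the group of maps x -> a1*x + a2."""
    return (a[0] * b[0], a[0] * b[1] + a[1])


def d_phi_db(a, b):
    """Matrix [alpha][beta] of d phi_alpha(a, b) / d b_beta (central differences)."""
    out = [[None, None], [None, None]]
    for beta in range(2):
        plus, minus = list(b), list(b)
        plus[beta] += STEP
        minus[beta] -= STEP
        p, m = compose(a, tuple(plus)), compose(a, tuple(minus))
        for alpha in range(2):
            out[alpha][beta] = (p[alpha] - m[alpha]) / (2 * STEP)
    return out


def A_upper(a):
    """A^alpha_(beta)(a): derivative of phi(a, b) in b, taken at the identity."""
    return d_phi_db(a, IDENTITY)


def inverse2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


a = (Fraction(3), Fraction(-2))
b = (Fraction(1, 2), Fraction(5))
c = (Fraction(7, 3), Fraction(1))

# Left and right members of (1) at the point (a, b).
lhs = d_phi_db(a, b)
rhs = matmul2(A_upper(compose(a, b)), inverse2(A_upper(b)))
```

For these sample points `lhs` and `rhs` agree, and `compose(compose(a, c), b)` equals `compose(a, compose(c, b))`, which is relation (6) for this group.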

We shall now show the existence of the inverse elements. For this purpose
we consider the above neighborhood $\bar{U}(\mathring{a}) \subset U(\mathring{a})$ such that for any three
points $a, b, c \subset \bar{U}(\mathring{a})$ the two members of (6) exist and are contained in $U(\mathring{a})$.
Hence we have


(7) $\varphi_\alpha(\varphi(a, b), c) = \varphi_\alpha(a, \varphi(b, c))$, $a, b, c \subset \bar{U}(\mathring{a})$.



















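Now differentiate (7) with respect to $a_\beta$ to obtain


(8) $\dfrac{\partial \varphi_\alpha(\varphi(a, b), c)}{\partial \varphi_\gamma}\,\dfrac{\partial \varphi_\gamma(a, b)}{\partial a_\beta} = \dfrac{\partial \varphi_\alpha(a, \varphi(b, c))}{\partial a_\beta}$,

and in these equations put $a = \mathring{a}$. Then

(9) $\dfrac{\partial \varphi_\alpha(b, c)}{\partial b_\gamma}\,B^{\gamma}_{(\beta)}(b) = B^{\alpha}_{(\beta)}(\varphi(b, c))$,

where we have put

(10) $\left[\dfrac{\partial \varphi_\alpha(a, b)}{\partial a_\beta}\right]_{a = \mathring{a}} = B^{\alpha}_{(\beta)}(b)$.


From (4) and (10) we have

(11) $B^{\alpha}_{(\beta)}(\mathring{a}) = \delta^{\alpha}_{\beta}$,

so that it is possible to select a neighborhood of $\mathring{a}$ such that the determinant

(12) $|\,B^{\alpha}_{(\beta)}(b)\,| \neq 0$


for all points of this neighborhood. The intersection of this neighborhood with
$\bar{U}(\mathring{a})$ we shall continue to denote by $\bar{U}(\mathring{a})$. Hence in $\bar{U}(\mathring{a})$ we can define the
matrix $\|B^{(\alpha)}_{\beta}\|$ inverse to the matrix $\|B^{\alpha}_{(\beta)}\|$ so that


(13) $B^{\alpha}_{(\gamma)}(b)\,B^{(\gamma)}_{\beta}(b) = B^{(\alpha)}_{\gamma}(b)\,B^{\gamma}_{(\beta)}(b) = \delta^{\alpha}_{\beta}$, $b \subset \bar{U}(\mathring{a})$.

Multiplying (9) by $B^{(\beta)}_{\delta}(b)$ gives


(14) $\dfrac{\partial \varphi_\alpha(b, c)}{\partial b_\beta} = B^{\alpha}_{(\gamma)}(\varphi)\,B^{(\gamma)}_{\beta}(b)$, $b, c \subset \bar{U}(\mathring{a})$.

As the conditions of integrability of (14) we obtain conditions of the form (4)
and (4') of §6 either of which is equivalent to the other on account of (13).
Between the $C$'s involved in these latter conditions of integrability and the $C$'s
involved in the conditions of integrability of the system (1) we have the
relations


(15) $\bar{C}^{\alpha}_{\beta\gamma} = -\,C^{\alpha}_{\beta\gamma}$,


these relations being obtained by the procedure employed in §6.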

We now have the necessary relations to establish the existence of the element
inverse to any element $b$ of a certain neighborhood of $\mathring{a}$. We denote the
inverse element by $\bar{b}$ and define it by


(16) $\bar{b} \cdot b = \mathring{a}$;
or


(16') $\varphi_\alpha(\bar{b}, b) = \mathring{a}_\alpha$.







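We shall derive from (16') the system of differential equations for $\bar{b}_\alpha$ as
functions of the coordinates of $b$. Differentiation of (16') with respect to $b_\beta$ gives


$B^{\alpha}_{(\gamma)}(\varphi)\,B^{(\gamma)}_{\delta}(\bar{b})\,\dfrac{\partial \bar{b}_\delta}{\partial b_\beta} + A^{\alpha}_{(\gamma)}(\varphi)\,A^{(\gamma)}_{\beta}(b) = 0$.


Since $\varphi = \mathring{a}$ it follows from (16'), (3) and (11) that these equations become


$B^{(\alpha)}_{\delta}(\bar{b})\,\dfrac{\partial \bar{b}_\delta}{\partial b_\beta} + A^{(\alpha)}_{\beta}(b) = 0$.


Multiplication of the latter equations by $B^{\gamma}_{(\alpha)}(\bar{b})$ leads to the desired system (9)
of §6:


(17) $\dfrac{\partial \bar{b}_\alpha}{\partial b_\beta} = -\,B^{\alpha}_{(\gamma)}(\bar{b})\,A^{(\gamma)}_{\beta}(b)$.


The system (17) is completely integrable on account of (15). Hence (17)
admits an analytic solution $\bar{b}(b) \subset \bar{U}(\mathring{a})$ uniquely determined by the condition
$\bar{b}(\mathring{a}) = \mathring{a}$ for points $b$ in a neighborhood $\tilde{U}(\mathring{a}) \subset \bar{U}(\mathring{a})$. We can now define the
functions


$\psi_\alpha(b) = \varphi_\alpha(\bar{b}(b), b)$, $b \subset \tilde{U}(\mathring{a})$,

such that $\psi(\mathring{a}) = \mathring{a}$. Then


(18) $\dfrac{\partial \psi_\alpha}{\partial b_\beta} = -\,\bigl(B^{\alpha}_{(\gamma)}(\psi)\,B^{(\gamma)}_{\delta}(\bar{b})\bigr)\bigl(B^{\delta}_{(\epsilon)}(\bar{b})\,A^{(\epsilon)}_{\beta}(b)\bigr) + A^{\alpha}_{(\gamma)}(\psi)\,A^{(\gamma)}_{\beta}(b)$

$\qquad = \bigl[A^{\alpha}_{(\gamma)}(\psi) - B^{\alpha}_{(\gamma)}(\psi)\bigr]\,A^{(\gamma)}_{\beta}(b)$.


Thus $\psi(b)$ is a solution of (18) such that $\psi(\mathring{a}) = \mathring{a}$. But $\psi = \mathring{a}$ is a solution of
these equations possessing the same initial values. Hence


$\varphi_\alpha(\bar{b}(b), b) = \mathring{a}_\alpha$, $b \subset \tilde{U}(\mathring{a})$.


In other words if $b \subset \tilde{U}(\mathring{a})$ there exists a point $\bar{b} \subset \bar{U}(\mathring{a})$ such that (16) is
satisfied.

For $z, b \subset \bar{U}(\mathring{a})$ the equation $z \cdot b = \mathring{a}$ has at most one solution $z = \bar{b}$. This
can be shown briefly in the following manner, use being made of the symbolic
notation. Suppose $b' \subset \bar{U}(\mathring{a})$ is a second element inverse to $b$; then $\bar{b} \cdot b =
b' \cdot b = \mathring{a}$. Hence $\bar{b} \cdot c$ and $b' \cdot c$ are analytic solutions of the equations


$\dfrac{\partial \varphi_\alpha}{\partial c_\beta} = A^{\alpha}_{(\gamma)}(\varphi)\,A^{(\gamma)}_{\beta}(c)$


both of which become $\mathring{a}$ for $c = b$. Hence $\bar{b} \cdot c = b' \cdot c$ and taking $c = \mathring{a}$ we have
$\bar{b} = b'$.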

To prove also that

(19) $b \cdot \bar{b} = \mathring{a}$; or


(19') $\varphi_\alpha(b, \bar{b}(b)) = \mathring{a}_\alpha$,



















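if $\bar{b} \cdot b = \mathring{a}$ where $b, \bar{b} \subset \bar{U}(\mathring{a})$ we observe that


$b \cdot (\bar{b} \cdot b) = b$, $(b, \bar{b} \subset \bar{U}(\mathring{a}))$

on account of (16). Hence

(20) $(b \cdot \bar{b}) \cdot b = b$, $(b, \bar{b} \subset \bar{U}(\mathring{a}))$, or
(20') $\varphi_\alpha(b \cdot \bar{b}, b) = b_\alpha$.

That is, $\varphi_\alpha(b \cdot \bar{b}, c)$ is that solution of

(21) $\dfrac{\partial \varphi_\alpha}{\partial c_\beta} = A^{\alpha}_{(\gamma)}(\varphi)\,A^{(\gamma)}_{\beta}(c)$


which, for $c = b$, assumes the value $b_\alpha$. But $\varphi_\alpha(\mathring{a}, c)$ is a solution of (21) with
the same property. Hence


$\varphi_\alpha(b \cdot \bar{b}, c) = \varphi_\alpha(\mathring{a}, c)$;


taking $c = \mathring{a}$ the equation (19) results.

Since $\bar{b} \cdot b = b \cdot \bar{b} = \mathring{a}$, it follows that $b$ is also the element inverse to $\bar{b}$.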
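
To make the construction of the inverse concrete, here is a worked instance that does not appear in the paper. The example group (maps $x \to a_1 x + a_2$ with $a_1 > 0$) is an assumption chosen for illustration, and the quantities $A$, $B$ are computed from their definitions as derivatives of the composition function at the identity.

```latex
% Illustrative example: the group of maps x -> a_1 x + a_2, a_1 > 0.
\begin{align*}
\varphi(a,b) &= (a_1 b_1,\; a_1 b_2 + a_2), \qquad \mathring{a} = (1,0),\\[4pt]
A^{\alpha}_{(\beta)}(a)
  &= \Big[\tfrac{\partial \varphi_\alpha(a,b)}{\partial b_\beta}\Big]_{b=\mathring{a}}
   = a_1\,\delta^{\alpha}_{\beta},
\qquad
B^{\alpha}_{(\beta)}(b)
   = \Big[\tfrac{\partial \varphi_\alpha(a,b)}{\partial a_\beta}\Big]_{a=\mathring{a}}
   = \begin{pmatrix} b_1 & 0\\ b_2 & 1 \end{pmatrix},\\[4pt]
\bar{b} &= \Big(\tfrac{1}{b_1},\; -\tfrac{b_2}{b_1}\Big),
\qquad
\bar{b}\cdot b = b\cdot\bar{b} = (1,0),\\[4pt]
\Big\|\tfrac{\partial \bar{b}_\alpha}{\partial b_\beta}\Big\|
  &= \begin{pmatrix} -1/b_1^{2} & 0\\ b_2/b_1^{2} & -1/b_1 \end{pmatrix}
   = -\,B^{\alpha}_{(\gamma)}(\bar{b})\,A^{(\gamma)}_{\beta}(b).
\end{align*}
```

The last line is the system for the inverse element, with $A^{(\gamma)}_{\beta}(b) = \delta^{\gamma}_{\beta}/b_1$ and $B^{\alpha}_{(\gamma)}$ evaluated at $\bar{b}$; the second line confirms that the left inverse is also a right inverse in this example.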

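On account of the continuity (analyticity) of the function $\bar{b}(b)$ and the fact
that $\bar{b}(\mathring{a}) = \mathring{a}$ we can find a neighborhood $U'(\mathring{a}) \subset \tilde{U}(\mathring{a})$ such that the point set
$T$ formed from the elements inverse to the elements of $U'(\mathring{a})$ likewise lies in
$\tilde{U}(\mathring{a})$. Since the inverse function $\bar{b}(b)$ is continuous (analytic) in $\tilde{U}(\mathring{a})$ it follows
that $T$ is a neighborhood $U''(\mathring{a})$ homeomorphic to $U'(\mathring{a})$. Hence the
neighborhood $\mathfrak{N}$ of $\mathring{a}$ defined by


$\mathfrak{N} = U'(\mathring{a}) + U''(\mathring{a})$

is such that $\mathfrak{N} \subset \tilde{U}(\mathring{a})$ and any element $b \subset \mathfrak{N}$ possesses an inverse $\bar{b} \subset \mathfrak{N}$.
The above neighborhood $\mathfrak{N}$ consists of points with the following properties:


($\alpha$) If $a, b \subset \mathfrak{N}$ the product $a \cdot b$ is defined.¹⁴
($\beta$) If $a, b, c \subset \mathfrak{N}$ the products $(a \cdot b) \cdot c$ and $a \cdot (b \cdot c)$ exist and are such that


$(a \cdot b) \cdot c = a \cdot (b \cdot c)$.

($\gamma$) There exists an element $\mathring{a} \subset \mathfrak{N}$ such that

$a \cdot \mathring{a} = \mathring{a} \cdot a = a$


for any element $a \subset \mathfrak{N}$.
($\delta$) If $a \subset \mathfrak{N}$ there exists an element $\bar{a} \subset \mathfrak{N}$ such that


$a \cdot \bar{a} = \bar{a} \cdot a = \mathring{a}$.

The neighborhood possessing the above properties ($\alpha$) $\cdots$ ($\delta$) will be called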





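¹⁴ The product $a \cdot b$ need not be a point of $\mathfrak{N}$. However a neighborhood $\mathfrak{N}'(\mathring{a}) \subset \mathfrak{N}$ exists
such that if $a, b \subset \mathfrak{N}'(\mathring{a})$ then $a \cdot b \subset \mathfrak{N}$.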







a group germ;¹⁵ this neighborhood will be a group germ as defined in §9 if we
can show that it is a neighborhood of the identity element of a Lie group.


17. Extension of the group germ. We can now extend the germ $\mathfrak{N}$ to a group
$\mathfrak{A}$ in which the only defined relations are simple product relations (see §9). In
consequence of the simple product relations it may occur that two points $a$ and $b$
of the germ $\mathfrak{N}$ with different coordinates become identical as points of the
group $\mathfrak{A}$. Such identifications were impossible in our previous discussion (§10)
since the group germ was then taken from an underlying group. By the use
of representations of the germ $\mathfrak{N}$ we shall now establish conditions sufficient
for the non-existence of such identifications in a neighborhood of the identity
element $\mathring{A}$ of $\mathfrak{A}$.

If in any neighborhood $U(\mathring{A})$ of $\mathfrak{A}$ there would exist two identical points
$A$ and $B$ with different coordinates we could find a sequence $C_n \to \mathring{A}$ such that
$C_n \neq C_m$ for $n \neq m$ where all points $C_n$ are identical with $\mathring{A}$. As considered
in §13 the sequence $C_n$ determines accumulation directions $\xi$ in $\mathring{a}$. Now let $a(a)$
be a representation of the germ $\mathfrak{N}$ and extend this representation to a matrix
group $\mathfrak{M}$ (see §11). Then to each element $A \subset \mathfrak{A}$ there corresponds just one
matrix of $\mathfrak{M}$ defined by the correspondence


$A = (a_1 a_2 \cdots a_k) \to a(a_1)\,a(a_2) \cdots a(a_k)$,


where $a_1, a_2, \ldots, a_k \subset \mathfrak{N}$. In the above points $C_n$ the matrices of $\mathfrak{M}$ are equal
to the identity matrix. Hence for an accumulation direction we have

(1) $\eta_{(\alpha)}\,\xi^{\alpha} = 0$,


where $\eta_{(\alpha)}$ are the characteristic matrices of the representation. It follows
therefore that if there exists a locally true representation of a neighborhood $U(\mathring{a}) \subset \mathfrak{N}$
or if there exists a representation such that for a given (but arbitrary) direction
the above relations (1) are not satisfied then there exists a neighborhood $U(\mathring{A})$
of $\mathfrak{A}$ in which there are no identifications. By means of this neighborhood $U(\mathring{A})$
we can define coordinate neighborhoods for each point of $\mathfrak{A}$ such that $\mathfrak{A}$ becomes
a Lie group (see §10). In a neighborhood $U(\mathring{A})$ of the identical element the
fundamental differential equations of the group $\mathfrak{A}$ will be identical with the given
equations (1) of §15.

The above considerations give the proof of what is called the first (and second)
fundamental theorem of Lie for ordinary continuous groups. It is to be observed
however, that this theorem has been proved only when there exists for a given
but arbitrary direction $\xi$ a solution $\eta_{(\alpha)}$ of the equations (3) in §8 for which
$\eta_{(\alpha)}\,\xi^{\alpha}$ is different from zero.¹⁶ As a further consequence we have, under the above





¹⁵ It is clear that we can always restrict the neighborhood $\mathfrak{N}$ in such a way that the
product of any number $m$ ($\le M$) of elements, where $M$ is an arbitrary but fixed integer, will
exist. Corresponding to the above property ($\beta$) we can then show that the associative law
will hold for any $m$ elements.

¹⁶ In particular this assumption is satisfied if $C^{\alpha}_{\beta\gamma}\,\xi^{\gamma} = 0$ has $\xi^{\gamma} = 0$ as the only
solution. The assumption is also satisfied for the (Abelian) case $C^{\alpha}_{\beta\gamma} = 0$. In fact any $r$
constants $\eta_{(\alpha)}$ give a solution of (3) in §8. We have therefore only to choose the
constants $\eta_{(\alpha)}$ such that $\eta_{(\alpha)}\,\xi^{\alpha} \neq 0$ where $\xi$ denotes the given direction.




assumption, the third fundamental theorem of Lie: Given any set of (real)
constants $C^{\alpha}_{\beta\gamma}$ satisfying the conditions (5) and (5') of §5 there exists a Lie group
having the $C^{\alpha}_{\beta\gamma}$ as its constants of composition. In fact we can determine a set of
analytic functions $A^{\alpha}_{(\beta)}$ which satisfy a system of differential equations of the form
(3') in §5 involving the given constants $C$. But these equations are the
integrability conditions of the system (12) of §4. Hence these latter equations are
completely integrable and we can therefore construct the group germ $\mathfrak{N}$ which can
then be extended to a Lie group.¹⁷
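
The hypothesis of the theorem is a finite set of algebraic conditions on the constants, which can be checked mechanically. The sketch below assumes that the conditions (5) and (5') of §5 are the usual antisymmetry and Jacobi relations (the same two relations are written out for the subgroup in §19); the storage convention `C[lam][a][b]` for $C^{\lambda}_{\alpha\beta}$ and both sample sets of constants are choices made here for illustration.

```python
def is_antisymmetric(C):
    """Check C^lam_{ab} = -C^lam_{ba} for all indices."""
    r = len(C)
    return all(
        C[lam][a][b] == -C[lam][b][a]
        for lam in range(r) for a in range(r) for b in range(r)
    )


def satisfies_jacobi(C):
    """Check the cyclic sum C^l_{ab} C^m_{lg} + C^l_{bg} C^m_{la} + C^l_{ga} C^m_{lb} = 0."""
    r = len(C)
    for a in range(r):
        for b in range(r):
            for g in range(r):
                for m in range(r):
                    total = sum(
                        C[l][a][b] * C[m][l][g]
                        + C[l][b][g] * C[m][l][a]
                        + C[l][g][a] * C[m][l][b]
                        for l in range(r)
                    )
                    if total != 0:
                        return False
    return True


def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2), or 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


# Constants of composition of the rotation group: C^lam_{ab} = epsilon_{ab lam}.
ROTATION = [[[levi_civita(a, b, lam) for b in range(3)] for a in range(3)] for lam in range(3)]

# A deliberately inconsistent set: [e0,e1] = e2, [e1,e2] = e0, [e2,e0] = e0.
BROKEN = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for (a, b, lam) in [(0, 1, 2), (1, 2, 0), (2, 0, 0)]:
    BROKEN[lam][a][b] = 1
    BROKEN[lam][b][a] = -1
```

`ROTATION` passes both checks, so the theorem applies to it. `BROKEN` is antisymmetric but violates the Jacobi relation, so it cannot be the set of constants of composition of any Lie group.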


VI. Subgroups 


18. Subgroups of $R_m$-coordinate groups ($m \ge 2$).¹⁸ Let $\mathfrak{A}$ be an $R_m$-coordinate
group and $\mathfrak{A}_1$ a subgroup of $\mathfrak{A}$. Introduce in $\mathfrak{A}_1$ the neighborhoods


$V(p) = \mathfrak{A}_1 \cap U(p)$,
$\bar{U}(p)$ = connected part of $V(p)$ containing $p$,


where $U(p)$ is any neighborhood of $p \subset \mathfrak{A}$. With respect to the neighborhoods
$V(p)$ or $\bar{U}(p)$ the subgroup $\mathfrak{A}_1$ is a Hausdorff space (see §1).

In the following only subgroups $\mathfrak{A}_1$ are considered which satisfy the properties
($\alpha$), ($\beta$) and ($\gamma$):

($\alpha$). To a point $p \subset \mathfrak{A}_1$ there corresponds at least one coordinate neighborhood
$U(p)$ of $\mathfrak{A}$ such that the corresponding neighborhood $\bar{U}(p)$ is a coordinate neighborhood
for $\mathfrak{A}_1$, i.e. $\bar{U}(p)$ is homeomorphic to a neighborhood of a point of the $\rho$
dimensional number space, $0 < \rho < r$.

Then a point $p' \subset \bar{U}(p)$ with coordinates $p'_1, \ldots, p'_\rho$ has in $U(p)$ the
coordinates $a_1(p'), \ldots, a_r(p')$ and


(1) $a_\alpha = a_\alpha(p_1, \ldots, p_\rho)$, $\alpha = 1, \ldots, r$,


is the parametric representation of the points of $\mathfrak{A}_1$ in $U(p)$.

($\beta$). The above functions $a_\alpha(p')$ are continuous and have continuous first and
second derivatives.

We can prove that ($\alpha$) and ($\beta$) hold for any point of $\mathfrak{A}_1$ if they hold for a
particular point $p$ of $\mathfrak{A}_1$. Let $q$ be any other point of $\mathfrak{A}_1$; then $f \subset \mathfrak{A}_1$ exists such
that $q = f \cdot p$. Also


(2) $U(q) = f \cdot U(p)$

is a coordinate neighborhood of $q$ if $U(p)$ is a coordinate neighborhood of $p$ (§2).





¹⁷ E. Cartan in his Théorie des Groupes Finis et Continus et l'Analysis Situs, Mémorial
des Sciences Mathématiques, 1930, pp. 17-19, has given an outline of a construction of the
adjoint group in the case that $C^{\alpha}_{\beta\gamma}\,\xi^{\gamma} = 0$ has $\xi^{\gamma} = 0$ as the only solution, but in his
remarks concerning the general case he has apparently not considered the question of
identifications in the group germ.

¹⁸ The case $m = 1$ necessitates only a few changes concerning the order of certain
derivatives in the above discussion.







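Hence

(3) $\mathfrak{A}_1 \cap U(q) = f \cdot (\mathfrak{A}_1 \cap U(p))$

since a point $f \cdot p'$ belongs to $\mathfrak{A}_1$ if and only if $p' \subset \mathfrak{A}_1$, i.e.

(3') $V(q) = f \cdot V(p)$; hence

(4) $\bar{U}(q) = f \cdot \bar{U}(p)$.


It follows from (2) and (4) that the parametric representation (1) of points of
$\mathfrak{A}_1$ in $U(p)$ becomes the parametric representation of points of $\mathfrak{A}_1$ in $U(q)$ if we
introduce in $U(q)$ and $\bar{U}(q)$ the coordinates of the corresponding points of $U(p)$
and $\bar{U}(p)$ respectively.

($\gamma$). The matrix

(5) $\left\|\dfrac{\partial a_\alpha}{\partial p_\sigma}\right\|$, $\alpha = 1, \ldots, r$; $\sigma = 1, \ldots, \rho$,

is of rank $\rho$ for any point $p \subset \mathfrak{A}_1$.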

We now show that for two overlapping neighborhoods $\bar{U}(p)$ and $\bar{U}(q)$ the
coordinate transformation functions are continuous and have continuous first and
second derivatives. Let $A = \bar{U}(p) \cap \bar{U}(q)$ and $a' \subset A$. Then representations

(6) $a'_\alpha = a'_\alpha(p_1, \ldots, p_\rho)$, $a''_\alpha = a''_\alpha(q_1, \ldots, q_\rho)$

exist for points of $A$. But for the corresponding neighborhoods in $\mathfrak{A}$ the points
of $U(p) \cap U(q)$ have coordinates $a'$ and $a''$ such that

(7) $a'_\alpha = a'_\alpha(a''_1, \ldots, a''_r)$, $a''_\alpha = a''_\alpha(a'_1, \ldots, a'_r)$,

these functions being continuous and having continuous derivatives to the
order $m \ge 2$. Since $\|\partial a'_\alpha/\partial p_\sigma\|$ has the rank $\rho$ we can assume that the
determinant $|\,\partial a'_\alpha/\partial p_\sigma\,| \neq 0$ for $\alpha, \sigma = 1, \ldots, \rho$ and hence we can solve the first set
of equations (6) to obtain


(8) $p_\sigma = p_\sigma(a'_1, \ldots, a'_\rho)$, $\sigma = 1, \ldots, \rho$,


these functions being continuous with continuous derivatives to the order 2.
From (8), the first set of equations (7) and the second set of equations (6) we
now have $p_\sigma = p_\sigma(q_1, \ldots, q_\rho)$ where the functions $p_\sigma$ are continuous and possess
continuous first and second derivatives.

We have now proved that $\mathfrak{A}_1$ is an $R_2$-coordinate space with respect to the
neighborhoods $\bar{U}(p)$. We show next that $\mathfrak{A}_1$ is an $R_2$-coordinate group (Lie group).
Let $p, q, r$ be three points of $\mathfrak{A}_1$ such that $p \cdot q = r$ and denote by $U(p)$, $U(q)$,
$U(r)$ and $\bar{U}(p)$, $\bar{U}(q)$, $\bar{U}(r)$ the corresponding neighborhoods in $\mathfrak{A}$ and $\mathfrak{A}_1$
respectively. Then the relations $p' \cdot q' = r'$ for $p' \subset \bar{U}(p)$, $q' \subset \bar{U}(q)$, $r' \subset \bar{U}(r)$ can be
described in $\mathfrak{A}_1$ by


(9) $\psi_\sigma(p', q') = r'_\sigma$, $\sigma = 1, \ldots, \rho$,



















and in $\mathfrak{A}$ by

(10) $\varphi_\alpha(a(p'), a(q')) = a_\alpha(r')$, $\alpha = 1, \ldots, r$,


where $\psi$ and $\varphi$ are the composition functions in $\mathfrak{A}_1$ and $\mathfrak{A}$ respectively. From
(9) and (10) we have


(11) $\varphi_\alpha(a(p'), a(q')) = a_\alpha(\psi(p', q'))$.


Since the rank of the matrix $\|\partial a_\alpha/\partial r'_\sigma\|$ is $\rho$ we can assume that $|\,\partial a_\alpha/\partial r'_\sigma\,| \neq 0$
for $\alpha, \sigma = 1, \ldots, \rho$ and hence solve (10) to obtain


$r'_\sigma = \psi_\sigma(p', q') = \chi_\sigma[\varphi(a(p'), a(q'))]$,


the functions $\chi$ being continuous with continuous first and second derivatives
with respect to $\varphi$. But $\varphi(a, b)$ is continuous in the $b$ and has continuous
derivatives in these coordinates to the order $m \ge 2$; hence $\psi_\sigma(p, q)$ is continuous and
has continuous first and second derivatives with respect to $q$. This completes
the proof of the above statement.


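19. Differential equations of the subgroup. Denote by $\mathring{p}$ the identity
element $\mathring{a} \subset \mathfrak{A}$ when this element is considered as an element of the Lie subgroup
$\mathfrak{A}_1$. Also let $A'$ denote the tensors $A$ defined in §4 when these tensors are
considered with reference to the subgroup $\mathfrak{A}_1$. By differentiation of (11) in §17
we obtain


(1) $\dfrac{\partial \varphi_\alpha(b, c)}{\partial c_\beta}\,\dfrac{\partial a_\beta(q)}{\partial q_\sigma} = \dfrac{\partial a_\alpha(\psi)}{\partial \psi_\tau}\,\dfrac{\partial \psi_\tau(p, q)}{\partial q_\sigma}$,


where $b = a(p)$, $c = a(q)$ and the primes have been dropped. Now put $q = \mathring{p}$
in (1); then $c = \mathring{a}$, $\psi = p$, $\varphi = b = a(p)$ and the equations (1) become


(2) $A^{\alpha}_{(\beta)}(a(p))\,M^{\beta}_{(\sigma)} = \dfrac{\partial a_\alpha(p)}{\partial p_\tau}\,A'^{\tau}_{(\sigma)}(p)$,

where

$M^{\beta}_{(\sigma)} = \left(\dfrac{\partial a_\beta}{\partial q_\sigma}\right)_{q = \mathring{p}}$.


The constants $M^{\beta}_{(\sigma)}$ can be regarded as the components of $\rho$ vectors $M_{(\sigma)}$
defining the tangent space to the subgroup $\mathfrak{A}_1$ at the identical point $\mathring{a}$. By ($\gamma$) of
§17 the rank of the matrix $\|M^{\beta}_{(\sigma)}\|$ is $\rho$. Multiplying (2) by $A'^{(\sigma)}_{\tau}(p)$ we have


(3) $\dfrac{\partial a_\alpha(p)}{\partial p_\tau} = A^{\alpha}_{(\beta)}(a(p))\,M^{\beta}_{(\sigma)}\,A'^{(\sigma)}_{\tau}(p)$


as the differential equations of the subgroup $\mathfrak{A}_1$.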







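The system (3) is completely integrable. Calculation of the conditions of
integrability leads to the equations


(4) $\left[\dfrac{\partial A^{\alpha}_{(\beta)}}{\partial a_\kappa}\,A^{\kappa}_{(\gamma)} - \dfrac{\partial A^{\alpha}_{(\gamma)}}{\partial a_\kappa}\,A^{\kappa}_{(\beta)}\right] M^{\beta}_{(\sigma)}\,M^{\gamma}_{(\pi)}\,A'^{(\sigma)}_{\tau}\,A'^{(\pi)}_{\upsilon}$

$\qquad +\; A^{\alpha}_{(\beta)}\,M^{\beta}_{(\sigma)} \left[\dfrac{\partial A'^{(\sigma)}_{\tau}}{\partial p_\upsilon} - \dfrac{\partial A'^{(\sigma)}_{\upsilon}}{\partial p_\tau}\right] = 0$.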


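If we denote by $C^{\alpha}_{\beta\gamma}$ and $C'^{\pi}_{\sigma\tau}$ the constants of composition of the groups $\mathfrak{A}$
and $\mathfrak{A}_1$ respectively the equations (4) can be written


(4') $C^{\alpha}_{\beta\gamma}\,M^{\beta}_{(\sigma)}\,M^{\gamma}_{(\tau)} = C'^{\pi}_{\sigma\tau}\,M^{\alpha}_{(\pi)}$; $\alpha, \beta, \gamma = 1, \ldots, r$; $\sigma, \tau, \pi = 1, \ldots, \rho$.


Now let $C^{\alpha}_{\beta\gamma}$ be the constants of composition of a Lie group $\mathfrak{A}$ and suppose
that the equations (4') involving these constants $C^{\alpha}_{\beta\gamma}$ admit a solution in the
(constant) quantities $M^{\beta}_{(\sigma)}$ and $C'^{\pi}_{\sigma\tau}$ where the matrix $\|M^{\beta}_{(\sigma)}\|$ has the rank $\rho$.
It then follows readily that the $C'^{\pi}_{\sigma\tau}$ are also the constants of composition of a
Lie group, i.e.

$C'^{\pi}_{\sigma\tau} = -\,C'^{\pi}_{\tau\sigma}$,
$C'^{\lambda}_{\sigma\tau}\,C'^{\mu}_{\lambda\pi} + C'^{\lambda}_{\tau\pi}\,C'^{\mu}_{\lambda\sigma} + C'^{\lambda}_{\pi\sigma}\,C'^{\mu}_{\lambda\tau} = 0$.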
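
Condition (4') is solvable for the constants $C'$ exactly when the vector $C^{\alpha}_{\beta\gamma} M^{\beta}_{(\sigma)} M^{\gamma}_{(\tau)}$ lies in the space spanned by the $M_{(\pi)}$ for every pair $\sigma, \tau$, which is a rank test. The following sketch illustrates this for the constants of composition of the rotation group; the helper names and the sample vectors are assumptions, and the vectors passed in are assumed to be linearly independent, as required of the matrix $\|M^{\beta}_{(\sigma)}\|$.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def bracket(C, u, v):
    """Components C^alpha_{beta gamma} u^beta v^gamma, with C stored as C[alpha][beta][gamma]."""
    r = len(C)
    return [
        sum(C[al][b][g] * u[b] * v[g] for b in range(r) for g in range(r))
        for al in range(r)
    ]


def admits_subgroup_constants(C, vectors):
    """True if (4') can be solved for C' with the given tangent vectors M_(sigma)."""
    base = rank(vectors)
    return all(
        rank(vectors + [bracket(C, u, v)]) == base
        for u in vectors for v in vectors
    )


def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2), or 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2


ROTATION = [[[levi_civita(b, g, al) for g in range(3)] for b in range(3)] for al in range(3)]

ONE_AXIS = [[1, 2, 3]]                 # any single direction
TWO_AXES = [[1, 0, 0], [0, 1, 0]]      # a coordinate plane
```

For the rotation constants every single direction works (with $C' = 0$), giving the one-parameter subgroups, while the plane spanned by the first two axes fails because the bracket of its basis vectors points along the third axis.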
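As shown in §5 we can now find analytic quantities $A'$ defined in a neighborhood
$U(\mathring{p})$ of a point $\mathring{p}$ of the $\rho$ dimensional number space such that


(5) $\left(\dfrac{\partial A'^{(\sigma)}_{\tau}}{\partial p_\upsilon} - \dfrac{\partial A'^{(\sigma)}_{\upsilon}}{\partial p_\tau}\right) A'^{\tau}_{(\pi)}\,A'^{\upsilon}_{(\kappa)} = C'^{\sigma}_{\pi\kappa}$,

where

$A'^{(\sigma)}_{\tau}(\mathring{p}) = \delta^{\sigma}_{\tau}$


and the matrix $\|A'^{(\sigma)}_{\tau}(p)\|$ is inverse to the matrix $\|A'^{\sigma}_{(\tau)}(p)\|$. With these
functions $A'$ the equations (3) are completely integrable and hence admit a
solution $a_\alpha(p)$ defined in a neighborhood $U'(\mathring{p}) \subset U(\mathring{p})$ such that $a_\alpha(\mathring{p}) = \mathring{a}_\alpha$ and
$a_\alpha(p) \subset U(\mathring{a})$ in which the functions $A$ in (3) are defined.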

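Now consider the completely integrable system


(6) $\dfrac{\partial r_\sigma}{\partial p_\tau} = A'^{\sigma}_{(\pi)}(r)\,A'^{(\pi)}_{\tau}(p)$, $\sigma, \tau, \pi = 1, \ldots, \rho$,


where the $A'$ are defined in $U(\mathring{p})$. Then there exists a neighborhood $\bar{U}(\mathring{p})$ such
that (6) admits a unique analytic solution $r_\sigma = \psi_\sigma(q, p)$ for $q, p \subset \bar{U}(\mathring{p})$ with
the initial values $\psi_\sigma(q, \mathring{p}) = q_\sigma$ where $r \subset U(\mathring{p})$. Hence for $p, q \subset \bar{U}(\mathring{p})$ both
members of the equations


(7) $\varphi_\alpha(a(q), a(p)) = a_\alpha(\psi(q, p))$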
















are defined and are equal since each member is a solution of (3) with the same 
initial values a.(q) for p = p. Indeed 





deel OO) = Ag leAP (Alp) ATs (a(R) MSE A 1%) 


ala = Afs)(y)M!?))A}(p), 


fad ” Afs)(a)M'P,A OWA WA,(p) 


= Afs)(a)M‘?),A +" (p). 


We can choose the neighborhood $O(\mathring{p})$ so that it contains the elements inverse to each of its elements (see §16). Then $U(\mathring{p})$ becomes a subgroup germ $\mathfrak{A}_1$. We have thus established the existence of a subgroup germ $\mathfrak{A}_1$ having a tangent space at $\mathring{a}$ determined by the solution $M^{(\alpha)}_{(\xi)}$ of the equations (4') and such that in $\mathfrak{A}_1$ the differential equations (3) are satisfied.


20. Extension of the subgroup germ. At each point $a$ of the underlying Lie group $\mathfrak{A}$ there is defined by the $p$ independent vectors

(1)  $\zeta^{\alpha}_{(\xi)}(a) = A^{\alpha}_{(\beta)}(a)\, M^{(\beta)}_{(\xi)}$

a $p$ dimensional vector space. These vector spaces are integrable. By this is meant that there exists a $p$ dimensional surface $F_p$ passing through the given point $a$ and having as tangent spaces the vector spaces defined by (1). Indeed by integrating the completely integrable system

(2)  $\dfrac{\partial a_\alpha}{\partial p_\xi} = A^{\alpha}_{(\beta)}(a)\, M^{(\beta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(p)$

with the initial values $a_\alpha(\mathring{p}) = a_\alpha$ we get a parametric representation of this $p$ dimensional surface in a neighborhood of the given point $a$. Denote this surface element by $E_p(a)$. Also denote by $S_p(a)$ the $p$ dimensional surface obtained by extending the element $E_p(a)$. Obviously

(3)  $S_p(a) = S_p(b)$, if $S_p(a) \supset b$.

We have

(4)  $b \cdot E_p(a) = E_p(b \cdot a)$;

in fact if $a_\alpha(p_1, \dots, p_p)$ is a solution of (2) then $b \cdot a(p) = c(p)$ is also a solution of (2). Indeed

$\dfrac{\partial c_\alpha}{\partial p_\xi} = A^{\alpha}_{(\beta)}(c)\, A^{(\beta)}_{\gamma}(a)\, A^{\gamma}_{(\delta)}(a)\, M^{(\delta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(p) = A^{\alpha}_{(\beta)}(c)\, M^{(\beta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(p).$







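In consequence of (4) it follows that if

(5)  $f \subset S_p(\mathring{a})$, $c \subset E_p(\mathring{a})$ then $f \cdot c \subset S_p(\mathring{a})$;

in fact

$f \cdot E_p(\mathring{a}) = E_p(f) \subset S_p(f) = S_p(\mathring{a}).$

Now take $E = E_p(\mathring{a})$ to be a subgroup germ (§18) and construct the group $\mathfrak{A}_1 \subset \mathfrak{A}$ whose elements are defined as products of a finite number of points of $E$. Then

(6)  $\mathfrak{A}_1 = S_p(\mathring{a}).$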


To prove this we show first that $\mathfrak{A}_1 \subset S_p(\mathring{a})$. By definition an element of $\mathfrak{A}_1$ is defined by

(7)  $a' = a_1 \cdot a_2 \cdots a_s \cdot a_{s+1} \cdots a_t$

where

$a_1, \dots, a_t \subset E.$

Now $a_1 \subset S_p(\mathring{a})$. Assume that $a_1 \cdots a_s \subset S_p(\mathring{a})$. Since $a_{s+1} \subset E$ it follows from (5) that $a_1 \cdots a_s \cdot a_{s+1} \subset S_p(\mathring{a})$. Hence by induction $a' \subset S_p(\mathring{a})$, i.e. $\mathfrak{A}_1 \subset S_p(\mathring{a})$. Now take $a \subset S_p(\mathring{a})$. Then there exists a continuous curve $C$ in $S_p(\mathring{a})$ connecting $\mathring{a}$ to $a$. In $C$ we define a class division. A point $b$ of $C$ will be said to belong to the class $A$ if $b$ and all points of $C$ between $\mathring{a}$ and $b$ belong to $\mathfrak{A}_1$. The class $A$ is not empty since it contains $\mathring{a}$. All other points of $C$ belong to the class $B$. By this class division there is defined a point $e$ which is either a last point of $A$ or a first point of $B$.

If the class $B$ is not empty we shall show that we have a contradiction. First suppose that $e$ is a last point of $A$. Consider

(8)  $E_p(e) = e \cdot E_p(\mathring{a}) = e \cdot E$

where of course $E_p(e) \subset S_p(\mathring{a})$. Then there exists a point $f \subset E_p(e)$, the point $f$ being a point of $C$ between $e$ and $a$ and not belonging to $\mathfrak{A}_1$. By the correspondence (8) we have $f = e \cdot a'$ where $a' \subset E$. Hence $f$ is a point of $\mathfrak{A}_1$, giving a contradiction. Second suppose that $e$ is the first point of $B$. Then $e$ is not a point of $\mathfrak{A}_1$, but all points of $C$ between $\mathring{a}$ and $e$ are points of $\mathfrak{A}_1$. Hence there exists a point $g \subset \mathfrak{A}_1$ such that $g = e \cdot a''$ by the correspondence (8), where $a'' \subset E$. But $e = g \cdot \bar{a}''$, which shows that $e \subset \mathfrak{A}_1$ since $\bar{a}''$ is contained in $E$ by the property of the group germ, and we thus have again a contradiction. Hence (6) is proved.

We wish to prove now that the properties (α), (β), (γ) of §17 are satisfied for the subgroup $\mathfrak{A}_1$. The group $\mathfrak{A}_1$ has in no point a dimension greater than $p$. This follows from the fact that $\mathfrak{A}_1$ can be covered by a countable set of $p$ dimensional elements $E_p$.* Let $a_\alpha(p_1, \dots, p_p) = a_\alpha$ be the parametric representation of the subgroup germ $E \subset \mathfrak{A}$, the functions $a_\alpha(p)$ being defined in a certain neighborhood $U_1(\mathring{p})$. Then there exists a neighborhood $U(\mathring{p}) \subset U_1(\mathring{p})$ homeomorphic to its image $a(p)$ in the group space $\mathfrak{A}$. Denote this image by $E_1$, which of course lies in $E$. Because of this homeomorphism there corresponds to a neighborhood $U'(\mathring{p}) \subset U(\mathring{p})$ a neighborhood $U(\mathring{a}) \subset \mathfrak{A}$ such that

Image of $U(\mathring{a}) \cap E_1 \subset U'(\mathring{p})$, or
$U(\mathring{a}) \cap E_1 \subset$ image of $U'(\mathring{p})$.

In particular

$[U(\mathring{a}) \cap E_1] \subset$ image of $U'(\mathring{p})$,

* K. Menger, Dimensionstheorie, Teubner 1928, p. 92, and O. Schreier, loc. cit.



where the bracket denotes the part of the intersection connected with $\mathring{a}$. Now take $U'(\mathring{p})$ so that its enclosure is contained in $U(\mathring{p})$; then $[U(\mathring{a}) \cap E_1]$ is the totality of all points of $U(\mathring{a})$ which can be connected to $\mathring{a}$ by curves in $U(\mathring{a})$ possessing the property that their tangents lie in the vector spaces defined by (1). We can now prove that

(9)  $[U(\mathring{a}) \cap E_1] = [U(\mathring{a}) \cap S_p(\mathring{a})].$

It is evident that each point of the left member is a point of the right member. Suppose then that $b$ is a point of the right member not contained in the left member. The point $b$ can therefore be connected to $\mathring{a}$ by a curve $C$ of $[U(\mathring{a}) \cap S_p(\mathring{a})]$ at some point $e$ of which the tangent vector will not lie in one of the above vector spaces (1). For any point $e' \subset E_p(e)$ there exists a point $f \subset S_p(\mathring{a})$ such that $e' = f \cdot e$; then the curve $f \cdot C$ will be a curve of $S_p(\mathring{a})$ passing through $e'$. Because such curves $f \cdot C$ pass through each point of a certain neighborhood $U(e) \subset E_p(e)$ it follows that $S_p(\mathring{a})$ is of dimensionality $p + 1$ at least at the point $e$, in contradiction with the fact that the dimensionality is $p$ at each point of $S_p(\mathring{a})$.

In consequence of (9) there exists a neighborhood $U(\mathring{a})$ of the identity element of the group $\mathfrak{A}_1$ satisfying the properties (α), (β) and (γ) of §17, which as we have seen has as a result that these properties are satisfied in each point of $\mathfrak{A}_1$. We have now established the following result: If $\mathfrak{A}$ is a Lie group with constants of composition $C^{(\alpha)}_{(\beta\gamma)}$ and if $\mathfrak{A}_1$ is a $p$ dimensional subgroup of $\mathfrak{A}$ satisfying the properties (α), (β), (γ), having the constants of composition $C'^{(\zeta)}_{(\xi\eta)}$, and whose tangent space at $\mathring{a}$ is defined by the matrix $\| M^{(\alpha)}_{(\xi)} \|$, then the equation (4') is satisfied by these quantities. Conversely if (4') admits a solution $M^{(\alpha)}_{(\xi)}$ and $C'^{(\zeta)}_{(\xi\eta)}$, the rank of the matrix $\| M^{(\alpha)}_{(\xi)} \|$ being $p$, then there exists a $p$ dimensional subgroup satisfying the properties (α), (β), (γ), having $C'^{(\zeta)}_{(\xi\eta)}$ as constants of composition, and whose tangent space at $\mathring{a}$ is given by the matrix $\| M^{(\alpha)}_{(\xi)} \|$. The problem of finding all subgroups of a given Lie group with properties (α), (β), (γ) is therefore reduced to an algebraic problem.

It follows evidently from (4) that

$S_p(a) = a \cdot S_p(\mathring{a}).$

Hence $S_p(a)$ gives the cosets in the decomposition of the group $\mathfrak{A}$ with regard to the subgroup $S_p(\mathring{a})$, i.e.

$\mathfrak{A} = S_p(\mathring{a}) + a \cdot S_p(\mathring{a}) + \cdots$


21. Representations as subgroups of a matrix group. Because of the fact that one can consider a representation of a given Lie group $\mathfrak{A}$ as a subgroup of the matrix group $\mathfrak{M}_n$ consisting of the totality of all matrices with non-vanishing determinants and of order equal to the dimension $n$ of this representation, one can derive the whole theory of representations from the theory of subgroups of matrix groups $\mathfrak{M}_n$. Of course the necessary and sufficient conditions one gets will be the same as derived in §8.

The elements $\| a_{\alpha\beta} \|$ of $\mathfrak{M}_n$ are in (1, 1) correspondence with the points

$(a_{11} \cdots a_{1n};\ a_{21} \cdots a_{2n};\ \cdots;\ a_{n1} \cdots a_{nn})$

of an $n^2$ dimensional (real or complex) number space exclusive of the hypersurface $| a_{\alpha\beta} | = 0$ in this space. The law of composition $a \cdot b = c$ in $\mathfrak{M}_n$ is defined by

(1)  $c_{\alpha\beta} = a_{\alpha\gamma} b_{\gamma\beta}.$


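Since the law of composition (1) is analytic the group $\mathfrak{M}_n$ is a Lie group (A-coordinate group). Let us now calculate the constants of composition of this group. We obtain first from (1) by differentiation

$\dfrac{\partial c_{\alpha\beta}}{\partial b_{\sigma\tau}} = a_{\alpha\sigma}\, \delta_{\beta\tau}.$

Hence the $A$ tensor is defined by

(2)  $A^{\alpha\beta}_{(\sigma\tau)}(a) = \left( \dfrac{\partial c_{\alpha\beta}}{\partial b_{\sigma\tau}} \right)_{b = \mathring{a}} = a_{\alpha\sigma}\, \delta_{\beta\tau};$ and

(2')  $A^{(\sigma\tau)}_{\alpha\beta}(a) = \bar{a}_{\sigma\alpha}\, \delta_{\tau\beta},$

where the $\bar{a}_{\alpha\beta}$ denote the elements inverse to the elements $a_{\alpha\beta}$. By definition the constants of composition are now given by

(3)  $C^{(\kappa\lambda)}_{(\mu\nu,\sigma\tau)} = \left( \dfrac{\partial A^{(\kappa\lambda)}_{\gamma\delta}}{\partial a_{\alpha\beta}} - \dfrac{\partial A^{(\kappa\lambda)}_{\alpha\beta}}{\partial a_{\gamma\delta}} \right) A^{\alpha\beta}_{(\mu\nu)}\, A^{\gamma\delta}_{(\sigma\tau)}.$

Substitution of (2) and (2') into (3) gives

(4)  $C^{(\kappa\lambda)}_{(\mu\nu,\sigma\tau)} = \delta_{\kappa\sigma}\, \delta_{\lambda\nu}\, \delta_{\mu\tau} - \delta_{\kappa\mu}\, \delta_{\lambda\tau}\, \delta_{\nu\sigma}.$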
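For a given subgroup $\mathfrak{M}'$ of the Lie group $\mathfrak{M}_n$ we have

(5)  $C^{(\kappa\lambda)}_{(\mu\nu,\sigma\tau)}\, M^{(\mu\nu)}_{(\xi)}\, M^{(\sigma\tau)}_{(\eta)} = M^{(\kappa\lambda)}_{(\zeta)}\, C'^{(\zeta)}_{(\xi\eta)},$

where the $M^{(\alpha\beta)}_{(\xi)}$ defines the characteristic tangent space to $\mathfrak{M}'$ at the identical element and the $C'^{(\zeta)}_{(\xi\eta)}$ are the constants of composition of the group $\mathfrak{M}'$. Introducing (4) in (5) we obtain

(6)  $M^{(\kappa\nu)}_{(\xi)}\, M^{(\nu\lambda)}_{(\eta)} - M^{(\kappa\nu)}_{(\eta)}\, M^{(\nu\lambda)}_{(\xi)} + M^{(\kappa\lambda)}_{(\zeta)}\, C'^{(\zeta)}_{(\xi\eta)} = 0$

as the equations characterizing the subgroup $\mathfrak{M}'$. Since

$\left( \dfrac{\partial a_{\alpha\beta}(a)}{\partial a_\xi} \right)_{a = \mathring{a}} = M^{(\alpha\beta)}_{(\xi)}$

we see that $\| M^{(\alpha\beta)}_{(\xi)} \|$ is the characteristic matrix of the representation $\mathfrak{M}'$ of the group $\mathfrak{A}$ described by the parameters $a_\xi$. Equations (6) are identical with the previously found equations (3) of §6 which characterize the representations of the class of Lie groups corresponding to the constants of composition $C'$.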
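
The passage from (5) to (6) can be checked numerically. The sketch below is not part of the original paper: it assumes the reading $C^{(\kappa\lambda)}_{(\mu\nu,\sigma\tau)} = \delta_{\kappa\sigma}\delta_{\lambda\nu}\delta_{\mu\tau} - \delta_{\kappa\mu}\delta_{\lambda\tau}\delta_{\nu\sigma}$ for (4) and verifies that contracting these constants with two matrices $x$, $y$ yields $yx - xy$. The helper names `C`, `contract` and `paper_product` are illustrative only.

```python
from itertools import product


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def C(k, l, m, n, s, t):
    """Assumed reading of (4): C^{(kl)}_{(mn,st)} for the full matrix group."""
    return (delta(k, s) * delta(l, n) * delta(m, t)
            - delta(k, m) * delta(l, t) * delta(n, s))


def matmul(x, y):
    """Ordinary product of two square matrices given as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def contract(x, y):
    """Left member of (5): sum over (mn, st) of C^{(kl)}_{(mn,st)} x_mn y_st."""
    idx = range(len(x))
    return [[sum(C(k, l, m, n, s, t) * x[m][n] * y[s][t]
                 for m, n, s, t in product(idx, repeat=4))
             for l in idx] for k in idx]


def paper_product(x, y):
    """The product (x, y) = yx - xy."""
    yx, xy = matmul(y, x), matmul(x, y)
    size = len(x)
    return [[yx[i][j] - xy[i][j] for j in range(size)] for i in range(size)]


X = [[1, 2], [3, 4]]
Y = [[0, 1], [5, -2]]
print(contract(X, Y) == paper_product(X, Y))
```

Under this assumption the contraction agrees with $yx - xy$ for any pair of square matrices, which is the step that turns (5) into the matrix relation (6).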


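22. Lie algebra. By the result of §19 we can say that a subgroup $\mathfrak{A}_1$ of the Lie group $\mathfrak{A}$ is characterized by the vectors defined in $\mathfrak{A}$ by

(1)  $\Lambda^{\alpha}_{(\xi)}(a) = A^{\alpha}_{(\beta)}(a)\, M^{(\beta)}_{(\xi)}.$

We shall accordingly call these vectors the characteristic vectors of the subgroup. The necessary and sufficient conditions (4') of §18 can now be written by means of the above vectors $\Lambda$. If we replace the $C^{(\alpha)}_{(\beta\gamma)}$ in these latter equations by their expressions in terms of the $A$'s we obtain

(2)  $\left( \dfrac{\partial A^{\alpha}_{(\beta)}}{\partial a_\delta} A^{\delta}_{(\gamma)} - \dfrac{\partial A^{\alpha}_{(\gamma)}}{\partial a_\delta} A^{\delta}_{(\beta)} \right) M^{(\beta)}_{(\xi)}\, M^{(\gamma)}_{(\eta)} = C'^{(\zeta)}_{(\xi\eta)}\, M^{(\beta)}_{(\zeta)}\, A^{\alpha}_{(\beta)},$

or

(3)  $(\Lambda_{(\xi)}, \Lambda_{(\eta)})^{\alpha} = C'^{(\zeta)}_{(\xi\eta)}\, \Lambda^{\alpha}_{(\zeta)},$

where

$(\Lambda_{(\xi)}, \Lambda_{(\eta)})^{\alpha} = \dfrac{\partial \Lambda^{\alpha}_{(\xi)}}{\partial a_\beta} \Lambda^{\beta}_{(\eta)} - \dfrac{\partial \Lambda^{\alpha}_{(\eta)}}{\partial a_\beta} \Lambda^{\beta}_{(\xi)}.$

The quantity in the left member of (3) is usually called the commutator of the two vectors $\Lambda_{(\xi)}$ and $\Lambda_{(\eta)}$. Conversely by starting from (3) in which the $\Lambda$ are given by (1) we can go back to the equations (4') of §18. Hence the necessary and sufficient conditions for the subgroup $\mathfrak{A}_1$ are given by the equations (3) in terms of the characteristic spaces of the subgroup.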

The problem of determining all subgroups $\mathfrak{A}_1$ of a given Lie group $\mathfrak{A}$ can be formulated as a problem of the so called Lie algebra. By a Lie algebra of dimension $r$ is meant an $r$ dimensional vector space for which, in addition to the sum of two vectors and the product of a vector by a constant, there is defined the product $(xy)$ of two vectors $x$ and $y$, the product being a vector of the space, such that

(4)  $(xy) = -(yx), \qquad (x(yz)) + (y(zx)) + (z(xy)) = 0,$

(4')  $(x + y, u + v) = (xu) + (xv) + (yu) + (yv),$

(4'')  $(cx, y) = c(xy),$

where $c$ is a constant. The fact that the dimensionality is $r$ means that there exist $r$, and only $r$, independent vectors $\Lambda_\alpha$ of the space. As a consequence of (4') and (4'') we know the law of multiplication for any two vectors of the Lie algebra if we know it for some basis $\Lambda_\alpha$ for which we may have

(5)  $(\Lambda_\alpha \Lambda_\beta) = C^{\gamma}_{\alpha\beta}\, \Lambda_\gamma.$

It follows from the first equation (4) that the constants $C^{\gamma}_{\alpha\beta}$ are skew-symmetric in the indices $\alpha$ and $\beta$. Also from the second equation (4) we have

$C^{\epsilon}_{\alpha\beta} C^{\delta}_{\epsilon\gamma} + C^{\epsilon}_{\beta\gamma} C^{\delta}_{\epsilon\alpha} + C^{\epsilon}_{\gamma\alpha} C^{\delta}_{\epsilon\beta} = 0.$

These constants $C^{\gamma}_{\alpha\beta}$ which characterize the Lie algebra thus satisfy the same condition as the constants of composition of a Lie group.²⁰
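
The two conditions on the constants can be tested mechanically for any candidate table. As a minimal sketch (not taken from the paper), the code below uses the vector product of ordinary three-dimensional space as the algebra, so that $C^{\gamma}_{\alpha\beta}$ is the permutation symbol; this choice of example and the names `is_skew` and `jacobi_defect` are assumptions made for illustration.

```python
from itertools import product


def levi_civita(a, b, c):
    """Permutation symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


R = range(3)
# C[g][a][b] stands for C^g_{ab} in (L_a L_b) = C^g_{ab} L_g.
C = [[[levi_civita(a, b, g) for b in R] for a in R] for g in R]


def is_skew(consts):
    """First relation of (4): the constants are skew in the lower indices."""
    r = range(len(consts))
    return all(consts[g][a][b] == -consts[g][b][a]
               for g, a, b in product(r, repeat=3))


def jacobi_defect(consts):
    """Largest absolute value of the cyclic sum implied by the second relation of (4)."""
    r = range(len(consts))
    worst = 0
    for a, b, g, d in product(r, repeat=4):
        total = sum(consts[e][a][b] * consts[d][e][g]
                    + consts[e][b][g] * consts[d][e][a]
                    + consts[e][g][a] * consts[d][e][b] for e in r)
        worst = max(worst, abs(total))
    return worst


print(is_skew(C), jacobi_defect(C))
```

Both checks depend only on the table of constants, so the same two functions apply to any proposed set of $C^{\gamma}_{\alpha\beta}$ in dimension $r$.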

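A $p\,(< r)$ dimensional subalgebra of the Lie algebra is defined by a set of $p$ independent vectors

(6)  $\Lambda_\xi = \Lambda_\alpha M^{\alpha}_{\xi}, \qquad \xi = 1, \dots, p; \quad \alpha = 1, \dots, r,$

such that

(7)  $(\Lambda_\xi, \Lambda_\eta) = C'^{\zeta}_{\xi\eta}\, \Lambda_\zeta,$

where of course the rank of the matrix $\| M^{\alpha}_{\xi} \|$ is $p$. If we introduce in (7) the expression (6) for the $\Lambda_\xi$ we deduce immediately the equations

(8)  $C^{\gamma}_{\alpha\beta}\, M^{\alpha}_{\xi}\, M^{\beta}_{\eta} = C'^{\zeta}_{\xi\eta}\, M^{\gamma}_{\zeta}.$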





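20 The vector fields

$\Xi^{\alpha}(a) = \lambda^{\beta} A^{\alpha}_{(\beta)}(a),$

where the $\lambda$'s are arbitrary constants, can be considered as the elements of a Lie algebra of dimension $r$ having as a basis the $r$ vector fields $A_{(\beta)}(a)$, the product of two elements $\Xi$ and $\Upsilon$ being defined by

$(\Xi, \Upsilon)^{\alpha} = \dfrac{\partial \Xi^{\alpha}}{\partial a_\beta} \Upsilon^{\beta} - \dfrac{\partial \Upsilon^{\alpha}}{\partial a_\beta} \Xi^{\beta}.$

The characteristic matrices $\mathfrak{n}_{(\alpha)}$ of a representation of a Lie group may also be taken as the elements of a basis of a Lie algebra. Each element $\mathfrak{n}$ of the algebra is then given by

$\mathfrak{n} = \mathfrak{n}_{(\alpha)} \xi^{\alpha},$

where the $\xi$'s are arbitrary constants and the product $(\mathfrak{n}, \mathfrak{m})$ of two elements $\mathfrak{n}$ and $\mathfrak{m}$ is defined by

$(\mathfrak{n}, \mathfrak{m}) = \mathfrak{m}\mathfrak{n} - \mathfrak{n}\mathfrak{m},$

where $\mathfrak{m}\mathfrak{n}$ and $\mathfrak{n}\mathfrak{m}$ represent ordinary matrix products. A Lie algebra whose elements are matrices with the above definition of product is said to be a representation of another Lie algebra if the two algebras have the same constants of composition for a suitably selected basis.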



















Conversely we can deduce from (8) by multiplication by $\Lambda_\gamma$ and use of (5) and (6) that (7) is satisfied. Hence the problem of determining a subalgebra of a given Lie algebra is identical with the problem of determining a subgroup $\mathfrak{A}_1$, satisfying the properties (α), (β), (γ) of §17, of a given Lie group $\mathfrak{A}$.
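
Condition (8) says that every product of two of the chosen vectors must again be a combination of them, so it can be tested as a rank condition. The following sketch is an illustration under stated assumptions rather than the authors' procedure: the three-dimensional example algebra with basis $h, e, f$ and products $(he) = 2e$, $(hf) = -2f$, $(ef) = h$ is hypothetical, as are the names `rank`, `bracket` and `is_subalgebra`.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def bracket(consts, x, y):
    """Components of (xy) when (L_a L_b) = C^g_{ab} L_g and consts[g][a][b] = C^g_{ab}."""
    r = range(len(x))
    return [sum(consts[g][a][b] * x[a] * y[b] for a in r for b in r) for g in r]


def is_subalgebra(consts, vectors):
    """True when all products of the given vectors stay inside their span."""
    products = [bracket(consts, x, y) for x in vectors for y in vectors]
    return rank(vectors + products) == rank(vectors)


# Hypothetical example: basis order (h, e, f).
C = [[[0] * 3 for _ in range(3)] for _ in range(3)]
C[1][0][1], C[1][1][0] = 2, -2      # (he) = 2e
C[2][0][2], C[2][2][0] = -2, 2      # (hf) = -2f
C[0][1][2], C[0][2][1] = 1, -1      # (ef) = h

print(is_subalgebra(C, [[1, 0, 0], [0, 1, 0]]))  # span of h and e
print(is_subalgebra(C, [[0, 1, 0], [0, 0, 1]]))  # span of e and f
```

Each input vector is one column $M^{\alpha}_{\xi}$ of the matrix in (6). When the test succeeds, the constants $C'^{\zeta}_{\xi\eta}$ of (8) are the coefficients expressing each product in terms of the chosen vectors.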
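VII. Invariant subgroups

23. Conditions for a subgroup to be invariant. Let $\mathfrak{A}_1$ be a subgroup of a Lie group $\mathfrak{A}$ defined in $\mathfrak{A}$ by $a_\alpha = a_\alpha(p_1, \dots, p_p)$. By the point transformation

(1)  $\bar{a} = b \cdot a \cdot \bar{b}$, ($b$ fixed),

in $\mathfrak{A}$ the subgroup $\mathfrak{A}_1$ will become a subgroup $\bar{\mathfrak{A}}_1 = b \cdot \mathfrak{A}_1 \cdot \bar{b}$. By definition the subgroup $\mathfrak{A}_1$ is invariant if for each $b \subset \mathfrak{A}$ we have $\bar{\mathfrak{A}}_1 = \mathfrak{A}_1$. Corresponding to the transformation (1) we have in the identical point $\mathring{a}$ the transformation of contravariant vectors $\lambda$ given by

(2)  $\bar{\lambda}^{\alpha} = H^{(\alpha)}_{(\beta)}(b)\, \lambda^{\beta}$

(see §7). Hence if $M^{(\alpha)}_{(\xi)}$ and $\bar{M}^{(\alpha)}_{(\xi)}$ define the characteristic vector spaces of the groups $\mathfrak{A}_1$ and $\bar{\mathfrak{A}}_1$ in the identical point we have

(3)  $\bar{M}^{(\alpha)}_{(\xi)} = H^{(\alpha)}_{(\beta)}(b)\, M^{(\beta)}_{(\xi)},$

where

$M^{(\alpha)}_{(\xi)} = \left( \dfrac{\partial a_\alpha(p)}{\partial p_\xi} \right)_{p = \mathring{p}}, \qquad \bar{M}^{(\alpha)}_{(\xi)} = \left( \dfrac{\partial \bar{a}_\alpha(p)}{\partial p_\xi} \right)_{p = \mathring{p}}.$

For an invariant subgroup $\mathfrak{A}_1$ the vector spaces determined by $M$ and $\bar{M}$ are the same and hence

(4)  $\bar{M}^{(\alpha)}_{(\xi)} = M^{(\alpha)}_{(\eta)}\, L^{(\eta)}_{(\xi)}(b).$

From (3) and (4) we have

(5)  $H^{(\alpha)}_{(\beta)}(b)\, M^{(\beta)}_{(\xi)} = M^{(\alpha)}_{(\eta)}\, L^{(\eta)}_{(\xi)}(b)$

as the necessary and sufficient conditions for the subgroup $\mathfrak{A}_1$ to be invariant. The sufficiency part of this condition follows from the fact that (5) implies (4), which means that $\mathfrak{A}_1$ and $\bar{\mathfrak{A}}_1$ have the same tangent space in the identical point; since this tangent space completely determines the subgroup we have $\bar{\mathfrak{A}}_1 = \mathfrak{A}_1$. On account of (5) the adjoint group $H$ is reducible (see §14). Hence by the result of §14 it follows that

(6)  $C^{(\alpha)}_{(\gamma\beta)}\, M^{(\beta)}_{(\xi)} = M^{(\alpha)}_{(\eta)}\, N^{(\eta)}_{(\gamma\xi)},$

where $C$ and $N$ denote the characteristic matrices of the representations $H$ and $L$ respectively (§8, (5)). Since (5) and (6) are equivalent (§14) the algebraic conditions (6) are necessary and sufficient for the subgroup $\mathfrak{A}_1$ to be an invariant subgroup.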







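24. Conditions for an invariant subgroup. Suppose that the constants $M^{(\alpha)}_{(\xi)}$ and $N^{(\eta)}_{(\gamma\xi)}$ satisfy (6) of §23 in which the $C^{(\alpha)}_{(\gamma\beta)}$ are the constants of composition of a given Lie group, the rank of the matrix $\| M^{(\alpha)}_{(\xi)} \|$ being $p$. Multiplying these equations by $M^{(\gamma)}_{(\xi)}$ we obtain

(1)  $C^{(\alpha)}_{(\gamma\beta)}\, M^{(\gamma)}_{(\xi)}\, M^{(\beta)}_{(\eta)} = M^{(\alpha)}_{(\zeta)}\, C'^{(\zeta)}_{(\xi\eta)},$

where

$C'^{(\zeta)}_{(\xi\eta)} = M^{(\gamma)}_{(\xi)}\, N^{(\zeta)}_{(\gamma\eta)}.$

By the result of §19 there exists a $p$ dimensional subgroup $\mathfrak{A}_1$ of $\mathfrak{A}$ having in the identical point the tangent space defined by the matrix $\| M^{(\alpha)}_{(\xi)} \|$. From §23 it follows that the subgroup $\mathfrak{A}_1$ is invariant. Hence, the problem of finding all invariant subgroups $\mathfrak{A}_1$ of a given Lie group $\mathfrak{A}$ is reduced to the algebraic problem of finding all solutions $M^{(\alpha)}_{(\xi)}$ and $N^{(\eta)}_{(\gamma\xi)}$ of the equations (6) of §23. This result can be expressed geometrically by saying the determination of all invariant subgroups is equivalent to the determination of all vector spaces $M^{(\alpha)}_{(\xi)}$ invariant under the set of $r$ transformations

$\bar{x}^{\alpha} = C^{(\alpha)}_{(\gamma\beta)}\, x^{\beta}, \qquad \gamma = 1, \dots, r.$

By introducing into (6) of §23 the expressions for the $C^{(\alpha)}_{(\gamma\beta)}$ given by (4) of §5 and multiplying by $A^{\delta}_{(\alpha)}$ we obtain the equivalent relations

(2)  $(\Lambda_{(\gamma)}, \Lambda_{(\xi)}) = \Lambda_{(\eta)}\, N^{(\eta)}_{(\gamma\xi)},$

where

$(\Lambda_{(\gamma)}, \Lambda_{(\xi)})^{\alpha} = \dfrac{\partial \Lambda^{\alpha}_{(\gamma)}}{\partial a_\beta} \Lambda^{\beta}_{(\xi)} - \dfrac{\partial \Lambda^{\alpha}_{(\xi)}}{\partial a_\beta} \Lambda^{\beta}_{(\gamma)};$

$\Lambda^{\alpha}_{(\gamma)} = A^{\alpha}_{(\gamma)}, \qquad \Lambda^{\alpha}_{(\xi)} = A^{\alpha}_{(\beta)}\, M^{(\beta)}_{(\xi)}.$

By definition (2) defines an invariant subalgebra of a Lie algebra (§21). The problem of finding all invariant subgroups of a Lie group is therefore equivalent to the problem of finding all invariant subalgebras of a Lie algebra.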
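
The difference between a subalgebra and an invariant subalgebra can be seen on a small matrix example. The sketch below is illustrative and not from the paper: it takes all $2 \times 2$ matrices with the product $(\mathfrak{n}, \mathfrak{m}) = \mathfrak{m}\mathfrak{n} - \mathfrak{n}\mathfrak{m}$ of footnote 20, and asks whether the product of an arbitrary basis matrix with a member of the subspace stays in the subspace. The choice of subspaces (matrices of trace zero, and diagonal matrices) and all function names are assumptions.

```python
def matmul(x, y):
    """Ordinary product of two square matrices given as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def paper_product(n, m):
    """The product (n, m) = mn - nm."""
    mn, nm = matmul(m, n), matmul(n, m)
    size = len(n)
    return [[mn[i][j] - nm[i][j] for j in range(size)] for i in range(size)]


def unit(i, j, size=2):
    """Matrix with a single 1 in row i, column j."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(size)] for r in range(size)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def is_diagonal(x):
    size = range(len(x))
    return all(x[i][j] == 0 for i in size for j in size if i != j)


FULL_BASIS = [unit(i, j) for i in range(2) for j in range(2)]
TRACELESS = [[[1, 0], [0, -1]], unit(0, 1), unit(1, 0)]
DIAGONAL = [unit(0, 0), unit(1, 1)]

# A subspace is invariant when (g, s) lies in it for every basis element g
# of the whole algebra and every basis element s of the subspace.
traceless_invariant = all(trace(paper_product(g, s)) == 0
                          for g in FULL_BASIS for s in TRACELESS)
diagonal_invariant = all(is_diagonal(paper_product(g, d))
                         for g in FULL_BASIS for d in DIAGONAL)
print(traceless_invariant, diagonal_invariant)
```

In this example the diagonal matrices are closed under the product among themselves, so they form a subalgebra, but they fail the stronger test against the whole algebra; the matrices of trace zero pass it.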


25. Factor groups. Let $\mathfrak{A}$ be a connected Lie group and $\mathfrak{A}_1$ a connected invariant subgroup (not necessarily a Lie group) of $\mathfrak{A}$. We can define neighborhoods in the factor group $\mathfrak{A}/\mathfrak{A}_1$ such that this group will be a topological group possessing the properties of simultaneous left and right continuity and continuity of the inverse elements, this being possible if, and only if, the subgroup $\mathfrak{A}_1$ is closed in $\mathfrak{A}$. Denoting by $A, B, C, \cdots$ the elements of the factor group $\mathfrak{A}/\mathfrak{A}_1$ take any point $a$ of the coset of $\mathfrak{A}$ corresponding to $A$ and let $U(a)$ be any neighborhood of $a$; as a neighborhood $U(A)$ we take the set of all elements $A'$ corresponding to elements $a' \subset U(a)$. For the proof of the above statement on the basis of this definition of neighborhood see O. Schreier (loc. cit.) or B. L. van der Waerden, Vorlesungen über kontinuierliche Gruppen, Göttingen, 1929.




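Now let $\mathfrak{A}_1$ be a $p$ dimensional Lie group. It is then possible to choose coordinates $p_1, \dots, p_r$ in a certain neighborhood of $\mathring{a}$ such that the co-sets of $\mathfrak{A}$ defined by $\mathfrak{A}_1$ have the equations $p_{p+1} = \text{const.}, \dots, p_r = \text{const.}$

Selecting coordinates $a_1, \dots, a_r$ in $\mathfrak{A}$ and $p_1, \dots, p_p$ in $\mathfrak{A}_1$ so that these groups will be A-coordinate groups, the parametric representation $a_\alpha(p_1, \dots, p_p)$ of $\mathfrak{A}_1$, where $a_\alpha(\mathring{p}) = \mathring{a}_\alpha$, is analytic on account of the equations (3) of §18. The point $c = a \cdot b$ of the co-set $\mathfrak{A}_1 \cdot b$ will have coordinates given by

(1)  $c_\alpha = \varphi_\alpha(a(p), b).$

Now take the $b_\beta$ in (1) to be analytic functions of a set of $r - p$ parameters $p_{p+1}, \dots, p_r$ such that $b_\beta(\mathring{p}) = \mathring{a}_\beta$. Then the $c_\alpha$ in (1) are analytic functions of the parameters $p_1, \dots, p_r$. We shall now show that we can choose the above functions $b_\beta(p)$ such that the equations (1) define an analytic coordinate transformation $c \to p$ in a certain neighborhood of $\mathring{a}$. Assume that

$\left( \dfrac{\partial a_\alpha}{\partial p_\sigma} \right)_{\mathring{p}} = \delta^{\alpha}_{\sigma}, \qquad \alpha = 1, \dots, r; \quad \sigma = 1, \dots, p,$

since this can be accomplished by a linear transformation of the coordinates $a_\alpha$ in a neighborhood of $\mathring{a}$. It then follows readily that

$\left( \dfrac{\partial c_\alpha}{\partial p_\sigma} \right)_{\mathring{p}} = \left( \dfrac{\partial a_\alpha}{\partial p_\sigma} \right)_{\mathring{p}} = \delta^{\alpha}_{\sigma}$, if $\sigma = 1, \dots, p$,

$\left( \dfrac{\partial c_\alpha}{\partial p_\sigma} \right)_{\mathring{p}} = \left( \dfrac{\partial b_\alpha}{\partial p_\sigma} \right)_{\mathring{p}}$, if $\sigma = p + 1, \dots, r$.

Hence the functional determinant of (1) in the point $\mathring{a}$ is

$\begin{vmatrix} 1 & 0 & \cdots & 0 & \dfrac{\partial b_1}{\partial p_{p+1}} & \cdots & \dfrac{\partial b_1}{\partial p_r} \\ 0 & 1 & \cdots & 0 & \dfrac{\partial b_2}{\partial p_{p+1}} & \cdots & \dfrac{\partial b_2}{\partial p_r} \\ \vdots & & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & \dfrac{\partial b_p}{\partial p_{p+1}} & \cdots & \dfrac{\partial b_p}{\partial p_r} \\ 0 & 0 & \cdots & 0 & \dfrac{\partial b_{p+1}}{\partial p_{p+1}} & \cdots & \dfrac{\partial b_{p+1}}{\partial p_r} \\ \vdots & & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & \dfrac{\partial b_r}{\partial p_{p+1}} & \cdots & \dfrac{\partial b_r}{\partial p_r} \end{vmatrix}.$

Now choose

$b_1 = \mathring{a}_1, \dots, b_p = \mathring{a}_p, \quad b_{p+1} = p_{p+1}, \dots, b_r = p_r.$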







Then the above determinant is equal to 1 and hence the equations (1) admit a unique analytic inverse in a certain neighborhood of $\mathring{a}$. In the $p$ coordinate system the equations

$p_{p+1} = \text{const.}, \dots, p_r = \text{const.}$

define the co-sets of $\mathfrak{A}$.

Now let $\psi_\alpha(p, q)$, $\alpha = 1, \dots, r$, be the (analytic) connection functions in the $p$ coordinate system defined in $U(\mathring{a})$. Then $\psi_\alpha(p, q)$ where $\alpha = p + 1, \dots, r$ is independent of the first $p$ coordinates $p_1, \dots, p_p$ and $q_1, \dots, q_p$ of the points $p$ and $q$; this is a consequence of the fact that the co-set $\psi_{p+1}, \dots, \psi_r$ is uniquely determined by the co-sets $p_{p+1}, \dots, p_r$ and $q_{p+1}, \dots, q_r$. Hence the analytic functions

$\psi_{p+1}(p_{p+1}, \dots, p_r, q_{p+1}, \dots, q_r), \ \dots, \ \psi_r(p_{p+1}, \dots, p_r, q_{p+1}, \dots, q_r)$

are the connection functions of the factor group $\mathfrak{A}/\mathfrak{A}_1$ defined in a neighborhood $O(\mathring{p})$ of the identical point $(\mathring{p}_{p+1}, \dots, \mathring{p}_r)$ of this group. It may happen that in each neighborhood of $\mathring{p}$ there will be two elements $p$ and $q$ having different coordinates but equal as elements of the group $\mathfrak{A}/\mathfrak{A}_1$. There will then exist a sequence $p_n \to \mathring{p}$, where the coordinates $p_{n\alpha} \neq p_{m\alpha}$ if $n \neq m$, such that each $p_n$ is the identical point $\mathring{p}$. In this case there exists no neighborhood of $\mathring{p}$ homeomorphic to a neighborhood of the $(r - p)$-dimensional number space and hence the factor group $\mathfrak{A}/\mathfrak{A}_1$ cannot be a Lie group. In the contrary case there exists a neighborhood $o(\mathring{p}) \subset O(\mathring{p})$ such that in $o(\mathring{p})$ there is no identification of points with different coordinates as points of the group $\mathfrak{A}/\mathfrak{A}_1$. We can then take a group germ (§9) contained in $o(\mathring{p})$ and by means of this germ (§10) we can define coordinate neighborhoods throughout the group $\mathfrak{A}/\mathfrak{A}_1$ such that on the basis of these coordinate neighborhoods $\mathfrak{A}/\mathfrak{A}_1$ will become a Lie group (A-coordinate group).

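Denote by use of ~ quantities associated with the factor group $\mathfrak{A}/\mathfrak{A}_1$ as functions of the coordinates $p_{p+1}, \dots, p_r$. Then

$\tilde{\psi}_\alpha(p_{p+1} \cdots p_r, q_{p+1} \cdots q_r) = \psi_\alpha(p_1 \cdots p_r, q_1 \cdots q_r), \qquad \alpha = p + 1, \dots, r.$

From these relations one has immediately the following relations

$\tilde{A}^{\alpha}_{(\beta)} = A^{\alpha}_{(\beta)}, \quad \tilde{A}^{(\alpha)}_{\beta} = A^{(\alpha)}_{\beta}, \qquad \alpha, \beta = p + 1, \dots, r,$

$A^{\alpha}_{(\xi)} = 0, \quad A^{(\alpha)}_{\xi} = 0, \qquad \alpha = p + 1, \dots, r; \quad \xi = 1, \dots, p.$

It follows from these latter relations and the equations

$C^{(\alpha)}_{(\beta\gamma)} = \left( \dfrac{\partial A^{(\alpha)}_{\nu}}{\partial p_\mu} - \dfrac{\partial A^{(\alpha)}_{\mu}}{\partial p_\nu} \right) A^{\mu}_{(\beta)}\, A^{\nu}_{(\gamma)}, \qquad \alpha, \beta, \gamma = p + 1, \dots, r; \quad \mu, \nu = 1, \dots, r,$

that

$\tilde{C}^{(\alpha)}_{(\beta\gamma)} = C^{(\alpha)}_{(\beta\gamma)}, \qquad \alpha, \beta, \gamma = p + 1, \dots, r.$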
















Now consider the problem of Lie algebra analogous to the problem of finding the factor group of a Lie group. Let $\mathfrak{A}$ be a Lie algebra of dimension $r$ having the basis elements

$\Lambda_1, \dots, \Lambda_p, \Lambda_{p+1}, \dots, \Lambda_r$

where the first $p$ elements form a basis of an invariant subalgebra $\mathfrak{A}_1$. Then

(2)  $(\Lambda_\mu, \Lambda_\nu) = C^{\xi}_{\mu\nu}\, \Lambda_\xi, \qquad \mu, \nu, \xi = 1, \dots, r$

and

(3)  $(\Lambda_\gamma, \Lambda_\sigma) = C^{\tau}_{\gamma\sigma}\, \Lambda_\tau, \qquad \sigma, \tau = 1, \dots, p; \quad \gamma = 1, \dots, r,$

these latter conditions expressing the fact that $\mathfrak{A}_1$ is an invariant subalgebra. We can now define a new Lie algebra out of the elements $A, B, C, \cdots$ of $\mathfrak{A}$ by imposing an equality relation between these elements, the law of addition and multiplication in $\mathfrak{A}$ being unchanged. Using the sign $\equiv$ to denote this new equality we shall say that $A \equiv 0$ if $A \subset \mathfrak{A}_1$, and that $A \equiv B$ if $A - B \equiv 0$. We have only to prove that any relation between elements of the Lie algebra remains unchanged (mod. $\mathfrak{A}_1$) if any element in this relation is replaced by an equal element (mod. $\mathfrak{A}_1$). Since any such relation is composed of the two operations $A + B$ and $(AB)$ this will be proved if it is shown that $A + B$ and $(AB)$ remain unchanged (mod. $\mathfrak{A}_1$) when $A$ and $B$ are replaced by an equal element (mod. $\mathfrak{A}_1$). But this fact is obvious for the sum $A + B$. Also $(A, B - B') \equiv 0$ if $B - B' \equiv 0$, as follows from (3). Hence $(AB) \equiv (AB')$ if $B \equiv B'$; similarly $(AB) \equiv (A'B)$ if $A \equiv A'$ and hence $(AB) \equiv (A'B')$ if $A \equiv A'$, $B \equiv B'$.

The above modular algebra is called the factor algebra of $\mathfrak{A}$ with respect to $\mathfrak{A}_1$ and is denoted by the symbol $\mathfrak{A}/\mathfrak{A}_1$. It is evident that the elements $\Lambda_{p+1}, \dots, \Lambda_r$ form a basis of the factor algebra, which is therefore $r - p$ dimensional. Also from (2) we have

$(\Lambda_\alpha, \Lambda_\beta) \equiv C^{\gamma}_{\alpha\beta}\, \Lambda_\gamma, \qquad \alpha, \beta, \gamma = p + 1, \dots, r,$

which shows that the constants of composition of the factor algebra $\mathfrak{A}/\mathfrak{A}_1$ are identical with the constants of composition of the corresponding factor group $\mathfrak{A}/\mathfrak{A}_1$ of a Lie group $\mathfrak{A}$.
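
In an adapted basis the factor algebra is obtained simply by discarding every constant that carries an index from the invariant subalgebra. The sketch below illustrates this on a hypothetical three-dimensional algebra (not from the paper) with $(\Lambda_2 \Lambda_3) = \Lambda_3 + \Lambda_1$, $(\Lambda_2 \Lambda_1) = \Lambda_1$, $(\Lambda_3 \Lambda_1) = 0$, in which $\Lambda_1$ spans the invariant subalgebra. Indices in the code are zero-based, and the names `is_invariant` and `factor_constants` are illustrative.

```python
def is_invariant(consts, p):
    """Condition (3): products with the first p basis elements stay in their span.

    consts[g][a][b] stands for C^g_{ab}; the test requires C^t_{g s} = 0
    whenever s < p and t >= p.
    """
    r = len(consts)
    return all(consts[t][g][s] == 0
               for g in range(r) for s in range(p) for t in range(p, r))


def factor_constants(consts, p):
    """Constants of the factor algebra: keep only indices p, ..., r - 1."""
    r = len(consts)
    return [[[consts[g][a][b] for b in range(p, r)] for a in range(p, r)]
            for g in range(p, r)]


# Hypothetical example, zero-based: L1 -> 0, L2 -> 1, L3 -> 2.
C = [[[0] * 3 for _ in range(3)] for _ in range(3)]
C[2][1][2], C[2][2][1] = 1, -1      # (L2 L3) has component L3
C[0][1][2], C[0][2][1] = 1, -1      # (L2 L3) has component L1
C[0][1][0], C[0][0][1] = 1, -1      # (L2 L1) = L1

print(is_invariant(C, 1))
print(factor_constants(C, 1))
```

In the factor algebra the component of $(\Lambda_2 \Lambda_3)$ along $\Lambda_1$ is dropped, leaving the two-dimensional algebra with $(\Lambda_2 \Lambda_3) \equiv \Lambda_3$.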


26. Two subgroups whose elements are commutable. Let $\mathfrak{A}_1$ and $\mathfrak{A}_2$ be two connected Lie subgroups of a Lie group $\mathfrak{A}$ having respectively the parametric representations

$a_\alpha = f_\alpha(p_1, \dots, p_m); \qquad a_\alpha = g_\alpha(q_1, \dots, q_n).$

We shall derive necessary and sufficient conditions for the commutativity of the elements $f(p)$ and $g(q)$ of these subgroups,

(1)  $\varphi_\alpha(f(p), g(q)) = \varphi_\alpha(g(q), f(p)).$







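By differentiating (1) with respect to $q_\xi$ and using the equations of the subgroups:

(2)  $\dfrac{\partial g_\alpha}{\partial q_\xi} = A^{\alpha}_{(\beta)}(g)\, M^{(\beta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(q),$

(2')  $\dfrac{\partial f_\alpha}{\partial p_\xi} = A^{\alpha}_{(\beta)}(f)\, N^{(\beta)}_{(\zeta)}\, A''^{(\zeta)}_{\xi}(p),$

we obtain

(3)  $A^{\alpha}_{(\beta)}(\varphi)\, A^{(\beta)}_{\gamma}(g)\, A^{\gamma}_{(\delta)}(g)\, M^{(\delta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(q) = B^{\alpha}_{(\beta)}(\varphi)\, B^{(\beta)}_{\gamma}(g)\, A^{\gamma}_{(\delta)}(g)\, M^{(\delta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(q),$

which is equivalent to

(4)  $A^{\alpha}_{(\beta)}(\varphi)\, M^{(\beta)}_{(\xi)} = B^{\alpha}_{(\beta)}(\varphi)\, H^{(\beta)}_{(\gamma)}(g)\, M^{(\gamma)}_{(\xi)}.$

Multiplying (4) by $B^{(\epsilon)}_{\alpha}(\varphi)$ gives

(5)  $H^{(\epsilon)}_{(\beta)}(\varphi)\, M^{(\beta)}_{(\xi)} = H^{(\epsilon)}_{(\beta)}(g)\, M^{(\beta)}_{(\xi)}.$

The equations (5) in which $\varphi$ denotes either $\varphi(f, g)$ or $\varphi(g, f)$ are necessary conditions for (1). Now assume that (5) holds for one of the above functions $\varphi$, for example $\varphi = \varphi(g, f)$. Then also (3) is satisfied for $\varphi = \varphi(g, f)$. Now put

(6)  $\psi_\alpha = \varphi_\alpha(f, g) - \varphi_\alpha(g, f).$

Then by differentiation we obtain

(7)  $\dfrac{\partial \psi_\alpha}{\partial q_\xi} = \left[ A^{\alpha}_{(\beta)}(\varphi(f, g)) - A^{\alpha}_{(\beta)}(\varphi(g, f)) \right] M^{(\beta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(q) = \left[ A^{\alpha}_{(\beta)}(\varphi(f, g)) - A^{\alpha}_{(\beta)}(\varphi(f, g) - \psi) \right] M^{(\beta)}_{(\zeta)}\, A'^{(\zeta)}_{\xi}(q)$

by the substitution (6). Now $\psi = 0$ is a solution of (7); also the solution (6) is zero for $q = \mathring{q}$. Hence (1) is satisfied. The conditions (5) for $\varphi = \varphi(g, f)$ or $\varphi = \varphi(f, g)$ are therefore necessary and sufficient for the commutativity of the elements of the two subgroups $\mathfrak{A}_1$ and $\mathfrak{A}_2$.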

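We shall now show that (5) can be replaced by an equivalent condition in which the quantities associated with the subgroups enter symmetrically. For this purpose we make use of the equations

(8)  $\dfrac{\partial H^{(\alpha)}_{(\beta)}(a)}{\partial a_\gamma} = H^{(\alpha)}_{(\epsilon)}(a)\, C^{(\epsilon)}_{(\delta\beta)}\, A^{(\delta)}_{\gamma}(a),$

which can easily be derived from (5) of §8. Differentiate (5), in which $\varphi = \varphi(g, f)$, with respect to $p_\xi$; this gives

(9)  $\dfrac{\partial H^{(\alpha)}_{(\beta)}(\varphi)}{\partial \varphi_\gamma}\, \dfrac{\partial \varphi_\gamma(g, f)}{\partial f_\delta}\, \dfrac{\partial f_\delta(p)}{\partial p_\xi}\, M^{(\beta)}_{(\eta)} = 0,$

since the right members of (5) are independent of $p$. Hence

$H^{(\alpha)}_{(\epsilon)}(\varphi)\, C^{(\epsilon)}_{(\kappa\beta)}\, A^{(\kappa)}_{\gamma}(\varphi)\, A^{\gamma}_{(\lambda)}(\varphi)\, A^{(\lambda)}_{\delta}(f)\, A^{\delta}_{(\mu)}(f)\, N^{(\mu)}_{(\zeta)}\, A''^{(\zeta)}_{\xi}(p)\, M^{(\beta)}_{(\eta)} = 0,$

which reduces to

(10)  $C^{(\alpha)}_{(\beta\gamma)}\, N^{(\beta)}_{(\xi)}\, M^{(\gamma)}_{(\eta)} = 0.$

Conversely from (10) we can deduce (9) and (8); hence

(11)  $H^{(\alpha)}_{(\beta)}(\varphi(g, f))\, M^{(\beta)}_{(\xi)}$

is independent of $p$. Since for $p = \mathring{p}$ we have $f = \mathring{a}$ and therefore $\varphi = g$, the expression (11) is seen to be equal to the right member of (5), i.e. (5) is satisfied. Hence, the conditions (10) are necessary and sufficient for the commutativity of the elements of the two subgroups $\mathfrak{A}_1$ and $\mathfrak{A}_2$.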

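Now consider the characteristic vector spaces of the subgroups $\mathfrak{A}_1$ and $\mathfrak{A}_2$ which are defined by

$\Lambda^{\alpha}_{(\xi)} = A^{\alpha}_{(\beta)}\, M^{(\beta)}_{(\xi)}, \qquad \bar{\Lambda}^{\alpha}_{(\eta)} = A^{\alpha}_{(\beta)}\, N^{(\beta)}_{(\eta)}.$

Then

$(\Lambda_{(\xi)}, \bar{\Lambda}_{(\eta)}) = M^{(\beta)}_{(\xi)}\, N^{(\gamma)}_{(\eta)}\, (A_{(\beta)}, A_{(\gamma)}) = M^{(\beta)}_{(\xi)}\, N^{(\gamma)}_{(\eta)}\, C^{(\epsilon)}_{(\beta\gamma)}\, A_{(\epsilon)}.$

On account of (10) we have

(12)  $(\Lambda_{(\xi)}, \bar{\Lambda}_{(\eta)}) = 0.$

Conversely (10) is a consequence of (12). Hence (12) also expresses necessary and sufficient conditions for the commutativity of the elements of $\mathfrak{A}_1$ and $\mathfrak{A}_2$.

As an immediate consequence of (12) we have the interesting result: The elements of two invariant subgroups $\mathfrak{A}_1$ and $\mathfrak{A}_2$ of a Lie group $\mathfrak{A}$ are commutative if their tangent spaces in $\mathring{a}$ have no intersection.

In particular if $\mathfrak{A}_1 = \mathfrak{A}_2$, so that $\mathfrak{A}_1$ is an abelian subgroup, then $N = M$ in (10) and these equations give

$C'^{(\zeta)}_{(\xi\eta)}\, M^{(\alpha)}_{(\zeta)} = C^{(\alpha)}_{(\beta\gamma)}\, M^{(\beta)}_{(\xi)}\, M^{(\gamma)}_{(\eta)} = 0,$

i.e. the constants of composition $C'$ of $\mathfrak{A}_1$ are equal to zero.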
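
For matrix groups the content of (10) and (12) can be seen numerically: when the generators of two one-parameter subgroups have vanishing product, the group elements commute, and otherwise they generally do not. The sketch below is an illustration under stated assumptions, not the authors' argument. It uses $2 \times 2$ real matrices, a truncated power series for the exponential, and three hypothetical generators `J` (rotation), `I2` (scaling) and `D` (diagonal stretch).

```python
def matmul(x, y):
    """Ordinary product of two square matrices given as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def scale(x, t):
    """Multiply every entry of x by the number t."""
    return [[t * v for v in row] for row in x]


def expm(x, terms=30):
    """Matrix exponential by a truncated power series (adequate for small matrices)."""
    size = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(size)]
                  for i in range(size)]
    return result


def commutator_defect(x, y):
    """Largest absolute entry of xy - yx; zero exactly when x and y commute."""
    xy, yx = matmul(x, y), matmul(y, x)
    size = range(len(x))
    return max(abs(xy[i][j] - yx[i][j]) for i in size for j in size)


J = [[0, -1], [1, 0]]    # generator of rotations
I2 = [[1, 0], [0, 1]]    # generator of uniform scalings
D = [[1, 0], [0, -1]]    # generator of a diagonal stretch

rotation = expm(scale(J, 0.7))
scaling = expm(scale(I2, 0.3))
stretch = expm(scale(D, 0.3))

print(commutator_defect(J, I2), commutator_defect(rotation, scaling))
print(commutator_defect(J, D), commutator_defect(rotation, stretch))
```

The first pair has vanishing product at the level of generators and the group elements commute up to rounding; the second pair fails both tests, in line with the necessity of (12).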


PRINCETON, N. J. 





